Field of the invention

The present invention is based on the finding that two fimbrial operons, the saf
operon and the tcf operon, are specific for Salmonella enterica subspecies I
bacteria and therefore have therapeutic use. Due to their specificity they can be
used to provide vaccines against Salmonella enterica subspecies I as well as for
detection of Salmonella enterica subspecies I. The saf operon is specific for all
Salmonella enterica subspecies I bacteria and the tcf operon is specific for the
serovar Typhi of Salmonella enterica subspecies I.

All or part of the DNA sequences of the genes encoding these proteins can be
used as active agents in a vaccine against diseases caused by the Salmonella
enterica subspecies I bacterial strains or for detection of said bacterial strains.

The present invention also relates to methods of isolating these fimbrial
proteins, to antibodies directed against these proteins, and to a vaccine
composition comprising these proteins or antibodies directed against these
proteins for use in the treatment of infections caused by Salmonella spp.
The fimbrial proteins according to the invention or antibodies directed against
them can be used for detection of Salmonella spp. bacteria.

Background of the invention 

The members of the genus Salmonella colonize and infect a wide range of
different organisms. Many cause gastroenteritis and enteric fever in humans
and domesticated animals, while others are not associated with human disease
(Saylers et al, 1994). The genus has been divided into two species, Salmonella
bongori and Salmonella enterica, where enterica can be further subdivided into
seven subspecies, designated I, II, IIIa, IIIb, IV, VI, and VII (Reeves et al, 1989).

Salmonella enterica subspecies I consists of over 1300 different serovars and is
preferentially associated with warm-blooded animals. Over 99% of all clinical
Salmonella isolates are strains belonging to this subspecies, including serovars
Typhimurium and Enteritidis, which are the major causes of Salmonella induced
gastroenteritis in humans, and Typhi, the human specific causative organism of
typhoid fever, the most severe form of human salmonellosis (Popoff and Le
Minor, 1992).

Today gastroenteritis and enteric fever can neither be prevented nor treated
with good results. Typhoid fever is a substantial public health problem in
developing countries. Each year 33 million people become ill and over 500 000
people die from this infection (American Institute of Medicine, 1986). Typhoid
fever can be prevented by vaccination with attenuated bacteria, such as Ty21
and Vi vaccines and whole cell vaccines. Whole cell vaccines show a high
incidence of side effects (Ashcroft et al, 1964, Yugoslav Typhoid commission,
1964). The vaccines consisting of attenuated strains of Salmonella typhi suffer
from serious drawbacks. They must be administered as three or four spaced
doses in order to stimulate protective immune responses (Levine et al, 1989).
The treatment of Salmonella typhi with antibiotics is jeopardized since there
are strains of Salmonella typhi that are resistant to chloramphenicol,
ampicillin, and trimethoprim as well as ciprofloxacin (i.e. multidrug-resistant
strains) (Rowe et al, 1997).

Accurate detection of Salmonella enterica subspecies I is today not possible.
Salmonella enterica subspecies I can today only be detected by antibodies
directed against surface proteins of Salmonella enterica subspecies I. The use of
the sequences according to the invention makes it possible, for the first time, to
rapidly and accurately determine the presence of Salmonella enterica
subspecies I.

For many pathogenic bacteria, there is evidence that the filamentous surface
protein structures called pili (fimbriae) are connected to the adhesion of the
bacteria to the host cells. Pili proteins are very antigenic and are easily
purified. Therefore pili preparations have been used as antigens for
vaccination.

Summary of the invention

The invention relates to the objects as defined in the claims. The main object of
the present invention is to provide two fimbrial proteins that are specific for
Salmonella enterica subspecies I bacterial strains, the nucleotide sequences
encoding said proteins, as well as the corresponding amino acid sequences,
for therapeutic and diagnostic use. Further, recombinant microorganisms are
provided, in which the nucleotide sequences according to the invention have
been inserted.

An object of the present invention is to provide vaccine compositions for use in
the treatment of Salmonella enterica infective strains, essentially pure Saf and
Tcf pili protein of Salmonella enterica subspecies I and Salmonella enterica
subspecies I serovar Typhi, respectively, as well as antibodies directed to these
pili proteins.

A further object of the present invention is to provide the DNA sequences of the
genes encoding the Saf and Tcf proteins. These sequences can be used for
recombinant production of the proteins and for the preparation of vector
vaccines against Salmonella enterica subspecies I and Salmonella enterica
subspecies I serovar Typhi, respectively, as well as for diagnostic purposes.

Yet another object of the present invention is to use purified Saf and Tcf protein
from Salmonella enterica subspecies I bacteria for active or passive
immunization of mammals, i.e. the proteins according to the invention can be
comprised in a vaccine composition or be used to raise antibodies which can be
comprised in a vaccine composition.

Finally, an object of the present invention is to provide a method for preventing
or reducing the possibility of Salmonella infection of a mammal by
administering the vaccines according to the invention. The invention may be
more fully understood by reference to the following drawings and detailed
description.

Brief description of the drawings

Figure 1.

Schematic representation of phage clones (named N10, D1, B1, F11) covering
the entire cs7 insert of Salmonella enterica serovar Typhimurium strain
SR-11 χ3181, i.e. comprising the saf fimbrial operon, i.e. safA, B, C and D
(SEQ ID NO 1).

The clones were selected from partial EcoRI and BamHI libraries in the
Lambda Dash II vector. The cs7 insert is represented by a bold line. The extent
of each phage insert is represented by horizontal bars. Name and size of
the phage inserts are indicated on the left side of the figure.
Figure 2.

Schematic representation of the pTY52 cosmid comprising the tcf operon (SEQ
ID NO 2).

A tcf specific PCR fragment of 11105 bp was cloned into the Expand vector I
cosmid (Roche). The insert is represented with a thick black line while vector
sequences are represented with thin lines. Relevant restriction sites
are indicated. The position of the tcf operon, i.e. tcfA, B, C and D (SEQ ID NO
2), is represented by a shaded arrow.
Figure 3.

The phylogenetic distribution of the identified genes on the cs7 insert was
investigated using the well defined SARC collection, see Example 1.

Figure 4.

A 2 kb internal EcoRI fragment was used as a probe in a Southern blot of
the SARC collection, see Example 2.

Sequence listing

SEQ ID NO 1 - DNA sequence of the genes encoding the precursor of the saf
fimbrial unit of Salmonella enterica subspecies I.

SEQ ID NO 2 - DNA sequence of the genes which encode the precursor of the
tcf fimbrial unit of Salmonella enterica subspecies I serovar Typhi.

Deposit information

The phages carrying the inserted SEQ ID NO 1, i.e. phage clones B1, D1, F11
and N10 (see Figure 1), have been given the ECACC Accession numbers
99051922, 99051923, 99051924, and 99051925, respectively.
The cosmid carrying the inserted SEQ ID NO 2, i.e. cosmid pTY52 (see Figure
2), has been given the ECACC Accession number 99051926.

The deposits were made May 19, 1999.

Detailed description of the invention

The present invention is based on the finding that two fimbrial operons, the saf
operon and the tcf operon, are specific for Salmonella enterica subspecies I
bacteria. Due to their specificity they can be used to provide vaccines against
Salmonella enterica subspecies I as well as detection methods for Salmonella
enterica subspecies I. The saf operon is specific for all Salmonella enterica
subspecies I bacteria and the tcf operon is specific for the serovar Typhi of
Salmonella enterica subspecies I, see Examples 1 & 2.

The main object of the invention relates to two fimbrial operons, the saf operon
and the tcf operon, that are specific for Salmonella enterica subspecies I
bacteria, for therapeutic use.

Another object of the present invention is to provide vaccines against
Salmonella enterica subspecies I induced gastroenteritis, enteric fever and
typhoid fever.

A further object of the present invention is to provide methods to detect
Salmonella enterica subspecies I. The nucleotide sequences according to the
invention are useful for constructing vectors for use as vaccines: for insertion
into attenuated bacteria in constructing a recombinant vaccine, for insertion
into a viral vector in constructing a recombinant viral vaccine, or for direct
inoculation as a nucleic acid vaccine. The pili proteins according to the
invention, or antigenic fragments thereof, can be used for active immunization
and antibodies directed against them can be used for passive immunization. All
these applications of the sequences according to the invention are obtained by
applying standard techniques known to a person of ordinary skill in the art.

Vaccines against Salmonella enterica subspecies I.

The genes encoding the saf and tcf fimbrial structures, or fragments thereof,
may be incorporated into a bacterial or viral vaccine comprising recombinant
bacteria, virus or fungi which are engineered to produce one or more
immunogenic epitopes of the saf or tcf fimbrial structures. In addition, the
genes encoding the saf and tcf fimbrial structures, or part thereof, operatively
linked to regulatory elements, can be introduced directly as a nucleic acid
vaccine, to elicit a protective immune response.

The proteins or antigenic fragments thereof, deduced from the nucleic acid
sequences of the present invention, are useful alone or in conventional vaccine
mixtures in the vaccine compositions according to the invention. The proteins
could be produced by chemical synthesis or recombinant expression according
to conventional methods.

The proteins and peptides according to the invention can be obtained by using
a host organism transformed or transfected with an expression vector obtained
by insertion of a gene according to the invention, or part thereof, into a vector
in a conventional manner. The vector which is used to construct the expression
vector is not particularly limited, but specific examples include plasmids such
as pET (Stratagen) and the like; and phages such as M13 (NEB), phage display
libraries and the like. T7 promoters and lac promoters, among others, can be
used as expression regulatory sequences.

An appropriate host to be transformed or transfected with the expression vector
can be chosen among, for example, E. coli, Salmonella or Bacillus subtilis. The
transformed or transfected host is cultured and proliferated under suitable
conditions.

After culturing, the peptides of the present invention may be purified by, for
example, chromatography, precipitation, and/or density gradient
centrifugation. The peptides thus obtained can be used as a vaccine or for the
production of antibodies directed against said peptides, which can be used for
passive immunization.

The purified preparation containing one or several proteins according to the
invention, or parts thereof, is then formulated as a pharmaceutical
composition, as for example a vaccine, or in a mixture with adjuvants. If
desired the proteins are fragmented by standard chemical or enzymatic
techniques to produce antigenic segments.

In formulating the vaccine compositions with the peptide or protein, alone or in
various combinations, the immunogen is adjusted to an appropriate
concentration and formulated with any suitable vaccine adjuvant. The
immunogen may also be incorporated into liposomes, or conjugated to
polysaccharides and/or other polymers for use in a vaccine formulation.

The different vaccines according to the present invention are administered to
mammals in many different ways. These include intradermal, intramuscular,
intraperitoneal, intravenous, subcutaneous, oral, and intranasal routes of
administration. The vaccine doses will differ depending on circumstances such
as body weight, interferences with other administered medicaments etc. The
upper limit is not critical unless the dose shows toxicity.

The peptides and proteins of the present invention are also useful to produce
monoclonal or polyclonal antibodies for use in passive immunotherapy against
Salmonella enterica subspecies I. Human immunoglobulin is preferred.
Antiserum is obtained from individuals immunized with proteins or peptides
according to the invention. The immunoglobulin fraction is then enriched, for
example by immunoaffinity or affinity chromatography. Antibodies raised in
a suitable mammal or in the patient to be treated can subsequently be
administered locally or topically, e.g. orally, to the patient.

Detection of Salmonella enterica subspecies I in general.

The sequences according to the invention, or part thereof, or fragments
hybridizing therewith, as well as the proteins according to the invention, or part
thereof, and antibodies directed to said proteins, or antigenic fragments
thereof, can be used in molecular diagnostic assays for the detection of
Salmonella enterica subspecies I.

Nucleic acids having the nucleotide sequence according to the invention, or any
nucleotide sequence hybridizing therewith, can be used as a probe in nucleic
acid hybridization assays for the detection of Salmonella spp in various tissues
and body fluids of patients. The hybridization assay may be of any type,
including Southern blots, Northern blots and colony blots.

PCR technology is the most preferred technology for detection of Salmonella
enterica subspecies I according to the invention. At least one primer selected
from the 5' end and one from the 3' end can be used in PCR and other known
tests to rapidly identify the presence of Salmonella enterica subspecies I. This is
done according to conventional techniques.
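The primer selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the default primer length are hypothetical and not taken from the specification, and the forward primer is simply read off the 5' end of the target while the reverse primer is the reverse complement of its 3' end.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    complement = {"a": "t", "t": "a", "c": "g", "g": "c"}
    return "".join(complement[base] for base in reversed(seq.lower()))


def end_primers(target: str, length: int = 18) -> tuple[str, str]:
    """Pick one primer from each end of a target region.

    The primer length is an illustrative assumption; the specification
    does not prescribe one.
    """
    target = target.lower()
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]                       # read from the 5' end
    reverse = reverse_complement(target[-length:])  # anneals at the 3' end
    return forward, reverse
```

In practice, primer candidates would also be screened for specificity and melting temperature; the sketch only shows where on the target the two primers come from.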

The isolated and purified proteins and peptides of the invention can be used as
diagnostics to measure an increase in serum titer of Salmonella enterica
subspecies I-specific antibody since they bind strongly to these antibodies. A
serum test sample can be screened for Salmonella enterica subspecies I by
methods such as, for example, ELISA.

The invention further comprises the use of antibodies directed against the saf
and tcf fimbrial structures for quantitative or qualitative determinations of the
pili proteins of the invention, or fractions thereof, in cells, tissues or body
fluids.

Detection of Salmonella enterica subspecies I by using nucleic acid
hybridization technology

Nucleic acid hybridization technology can also be used to detect Salmonella
enterica subspecies I according to the invention. The nucleic acid probes chosen
from parts of the sequences according to the invention can be either DNA or RNA.
DNA sequences complementary to the sequences according to the invention
can also be used. The binding of the probe to the target sequence, i.e. the
hybridization, need not be perfect. Variations and mutations of the sequences
[...] to detect Salmonella enterica subspecies I. The preferred length of the
nucleic acid probes is about 10 to 400 nucleotides, most preferred not longer
than 100 nucleotides.

The nucleotide probe is preferably chosen from the parts of the sequences that
have the least variation. In the most preferred embodiments, when screening for
SEQ ID NO 1 (the saf operon, specific for Salmonella enterica subspecies I), a
nucleotide probe or PCR primer selected from nucleotides 37368-37868
should be avoided since this region is hypervariable.

The nucleic acid probes according to the invention are prepared by any
conventional method such as organic synthesis, recombinant techniques, or
isolation from genomic DNA.

The nucleic acid probes of the invention are labeled in a conventional manner
to signal hybridization to target nucleic acid from Salmonella enterica
subspecies I. The labeling may comprise a radiolabel, an enzyme, a bacterial
label, a fluorescent label, an antibody, an antigen, a latex particle, an electron
dense compound, or a light scattering particle.

The probes may be provided in a lyophilized form, to be reconstituted in a
buffer appropriate for hybridization, or the probes may already be present in
such a buffer. The buffer may contain a suitable hybridization enhancer,
detergent, carrier DNA, and a compound to increase the specificity.
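
The probe guidance given above for SEQ ID NO 1 (a length of about 10 to 400 nucleotides, preferably at most 100, and avoidance of nucleotides 37368-37868) can be expressed as a simple check. The following is a minimal sketch, assuming 1-based inclusive coordinates on SEQ ID NO 1; the function name and the returned dictionary keys are hypothetical.

```python
# Coordinates and limits as stated in the description (1-based, inclusive).
HYPERVARIABLE = (37368, 37868)
MIN_LEN, MAX_LEN, PREFERRED_MAX = 10, 400, 100


def probe_check(start: int, end: int) -> dict:
    """Evaluate a candidate probe spanning start..end on SEQ ID NO 1."""
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    # Two closed intervals overlap when each one starts before the other ends.
    overlaps = start <= HYPERVARIABLE[1] and end >= HYPERVARIABLE[0]
    return {
        "length": length,
        "length_ok": MIN_LEN <= length <= MAX_LEN,
        "preferred_length": MIN_LEN <= length <= PREFERRED_MAX,
        "avoids_hypervariable": not overlaps,
    }
```

A candidate for which all three flags are true satisfies the stated preferences; the check says nothing about sequence conservation elsewhere, which would still have to be assessed separately.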

Any conventional hybridization assay technique, such as dot blot hybridization,
Southern blotting, sandwich hybridization, displacement hybridization and the
like, can be used.

The target analyte polynucleotide of a microorganism may be in various media,
most often in a biological or physiological specimen. In most cases it is
preferred to subject the specimen containing the target polynucleotide to any
conventional extraction, purification, and/or isolation before conducting the
analysis.

The sample containing the target analyte nucleotide sequence must often be
treated to convert the DNA to a single-stranded form, which may be
accomplished by a variety of conventional techniques, such as thermal or
chemical techniques.

The following examples describe the isolation and specificity of the sequences
according to the invention.

EXAMPLE 1

Identification and characterization of the saf operon.

The present inventors found, upon investigation of a 7 kb chromosomal region
on centisome 7 originally isolated from the S. typhimurium strain SR-11 χ3181,
a region that exhibits many of the traits that define a pathogenicity island. It
has a lower G+C composition than the average composition of the Salmonella
genome and includes many sequences related to different mobile genetic
elements. The region is not present in E. coli K-12, and the Salmonella specific
DNA is inserted between the tRNA gene aspV and the stop codon of yafV, a
hypothetical protein upstream of the yafH gene at 5 min in the E. coli
chromosome. This Salmonella specific insert encodes proteins creating adhesive
structures and other virulence factors. Sequencing revealed genes encoding a
new fimbrial operon that they designated Salmonella Atypical Fimbriae (saf),
due to its relatedness to a subgroup of adhesive structures forming thin
atypical fimbriae or non-fimbrial adhesins.

The saf operon consists of four contiguous genes, safA, safB, safC and safD,
that encode a fimbrial subunit, a periplasmic chaperone, an outer membrane
usher protein and an alternative fimbrial subunit, respectively. The genes safA,
safB, safC and safD encode putative proteins of 166, 244, 836 and 156 amino
acids, respectively. Analyses of clinical Salmonella isolates showed that DNA of
195 out of 198 clinical isolates belonging to S. enterica subspecies I hybridized
with safB and safC, i.e. these sequences are common to more than 99% of the
known Salmonella enterica subspecies I bacteria. The inventors showed that
58% of these clinical isolates carry the safA gene, see Table 1.
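
The protein sizes quoted above can be compared with the CDS coordinates given in the feature table of Sequence Listing No. 1. The sketch below assumes that each CDS range is 1-based and inclusive and that it contains one terminal stop codon; the assignment of the first, unlabelled CDS (37368..37868) to safA is also an assumption, based on the order of the genes in the operon.

```python
# CDS coordinates as given in the <222> fields of Sequence Listing No. 1.
# The safA label for the first range is an assumption (it has no <223> entry).
CDS_RANGES = {
    "safA": (37368, 37868),
    "safB": (37952, 38689),
    "safC": (38713, 41223),
    "safD": (41245, 41715),
}


def residues_from_cds(start: int, end: int) -> int:
    """Number of amino acids encoded by a CDS, excluding the stop codon."""
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return nucleotides // 3 - 1


sizes = {gene: residues_from_cds(*span) for gene, span in CDS_RANGES.items()}
```

Under these assumptions safA, safC and safD come out at 166, 836 and 156 residues, matching the figures in the text, while safB comes out at 245 rather than 244; the sketch does not attempt to resolve that one-residue difference.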



Table 1. The prevalence of the saf genes in clinical Salmonella isolates.

Serovar                   safA   safB   safC   # isolates
S. adelaide               -      +      +      1
S. agona                  +      +      +      6
S. anatum                 -      +      +      3
S. bareilly               +      +      +      3
S. blockley               +             +      3
S. bovismorbificans       -      +      +      5
S. braenderup             -      +      +      4
S. brandenburg            +      +      +      1
S. bredeney               +/-           +      15
S. chester                +      +      +      1
S. colindale              -      +      +      1
S. derby                         +      +      1
S. dublin                 -      +      +      1
S. eastbourne             +      +      +      2
S. emek                   +      +      +      1
S. enteritidis            -      +      +      8
S. give                   -      +      +      1
S. goettingen             +      +      +      1
S. haardt                 -             +      1
S. hadar                  +      +      +      16
S. heidelberg             -      +      +      1
S. hvittingfoss           +      +      +      5
S. infantis                      +      +      6
S. java                   -      +      +      1
S. javiana                -      +      +      1
S. kottbus                -      +      +      1
S. livingstone            -             +      1
S. london                 +      +      +      1
S. maastricht             +      +      +      2
S. mbandaka               -      -      -      3
S. montevideo             +      +      +      1
S. muenster               -      +      +      1
S. newport                +      +      +      2
S. ohio                   +      +      +      1
S. oranienburg            +      +      +      2
S. panama                 +      +      +      3
S. potsdam                +      +      +      1
S. rissen                 -      -      -      1
S. saarbrucken            -      +             1
S. saint paul             +      +      +      3
S. schwartzengrund        -      +      +      1
S. singapore              +      +      +      1
S. stanley                +             +      5
S. subsp I 4,5,12:-:-     +      +      +      2
S. subsp I 4,5,12:b:-     -      +      +      1
S. subsp I 4,5,12:i:-     +      +      +      1
S. subsp I spont                 +      +      1
S. tennessee              +      +      +      2
S. thompson                             +      1
S. typhi                         +      +      1
S. typhimurium            +      +      +      27
S. virchow                +      +             7
S. weltevreden                   +      +      1
S. worthington                                 2
S. subsp III

The phylogenetic distribution of the identified genes on the cs7 insert was
investigated using the well defined SARC collection, which showed that the
presence of the safA, safB, safC and safD genes is restricted to S. enterica
subspecies I (Fig. 3). This region is hence the first subspecies I specific genetic
region to be identified with a broad distribution within the subspecies. Since
the serovars of subspecies I constitute over 99% of human salmonellosis and
are preferentially associated with warm-blooded animals, this implicates a role
for the saf adhesive organelle in the colonization of these organisms.

EXAMPLE 2

Identification and characterization of the tcf operon.

The present inventors found that Salmonella enterica subspecies I serovar Typhi
[...] sinR-pagN intergenic region. Southern blot analysis revealed a markedly
different restriction pattern in S. enterica serovar Typhi than in the other
subspecies I isolates, suggesting that the saf-sin region in serovar Typhi might
carry additional DNA relative to serovar Typhimurium strains. A PCR reaction
(using a kit from Roche) was therefore performed using a sinR specific forward
primer (5'-GTA AAT CGC TTA GTC GCC-3') and a pagN specific reverse primer
(5'-TCA ACT CAA CCT TCA GCC-3').

This primer pair produced, as expected, a product of 2 kb in serovar
Typhimurium from the SARC collection, while from serovar Typhi the product
was 10 kb. Thus, the sinR and pagN genes, which are neighbors in serovar
Typhimurium strains, are separated by approximately 8 kb of additional DNA
in serovar Typhi.
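
The arithmetic behind this comparison, together with two basic properties of the primers, can be laid out as a short Python sketch. The primer sequences and product sizes are those given above; the function names are hypothetical, and the melting-temperature estimate uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is an assumption introduced here for illustration and is not stated in the specification.

```python
# Primer sequences as given in Example 2 (spaces removed).
SINR_FORWARD = "GTAAATCGCTTAGTCGCC"
PAGN_REVERSE = "TCAACTCAACCTTCAGCC"


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule (assumption)."""
    primer = primer.upper()
    gc = sum(base in "GC" for base in primer)
    at = sum(base in "AT" for base in primer)
    return 2 * at + 4 * gc


def extra_dna_kb(product_kb: float, reference_kb: float) -> float:
    """Additional DNA implied by a longer PCR product between the same primers."""
    return product_kb - reference_kb


# 10 kb product in serovar Typhi versus 2 kb in serovar Typhimurium.
typhi_insert_kb = extra_dna_kb(10, 2)
```

Both primers are 18 nucleotides long with a G+C fraction of one half, and the difference in product size corresponds to the approximately 8 kb of additional DNA reported for serovar Typhi.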

The Typhi specific PCR product was purified, digested partially with EcoRI and
sub-cloned into pUC18, forming a set of overlapping clones. Sequencing of the
clones revealed a putative fimbrial operon designated tcf for Typhi Colonizing
Factor. Four ORFs, tcfA, B, C and D, have been identified with putative proteins
having significant homology to CooB (38% identical over 192 aa), CooA (37%
identical over 170 aa), CooC (34% identical over 872 aa) and CooD (31%
identical over 272 aa), respectively. The Coo proteins are involved in the
biosynthesis of the CS1 colonizing factor antigens of enterotoxigenic E. coli (Fig.
4) (Froehlich et al., 1994). The peptide of the tcfB ORF is also homologous to
the CblA major fimbrial subunit protein (45% identical over 154 aa) of the cable
type II pili of the cystic fibrosis-associated Burkholderia cepacia (Sajjan et al.,
1995). Downstream of the tcf operon two ORFs were identified with the same
transcriptional orientation as the tcf genes. The first was designated tinR for
Typhi insert regulator because it is homologous (33% identical over 144 aa) to
AzlB of Bacillus subtilis, a member of the Lrp/AsnC family of transcriptional
regulators (Belitsky et al., 1997). tinR is followed by an ORF (tioA for Typhi
insert orf) encoding a putative protein of 205 amino acids with no significant
homologies to anything in the DDBJ/EMBL/GenBank databases. The above
sequence from Salmonella enterica serovar Typhi strain RKS 3333 and the tcf
region of the incomplete genome sequence from serovar Typhi strain CT18
(http://www.sanger.ac.uk) are 99% identical over the total length of the
investigated region, in concordance with the clonal nature of the serovar.
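
Figures of the form "38% identical over 192 aa" express the share of identical positions in a pairwise alignment. The following minimal sketch shows one common way to compute such a value; it assumes that the two sequences are already aligned to equal length with "-" for gaps and that identity is counted over the full alignment length, a convention chosen here for illustration since the specification does not state which one was used. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned, equal-length strings using '-' for gaps.
    Gap columns count toward the alignment length but never as identities.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)
```

For instance, two aligned four-residue toy peptides that differ at one position give 75% identity under this convention.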

A 2 kb internal EcoRI fragment was used as a probe in a Southern blot of
the SARC collection. This blot shows that Salmonella enterica subspecies I
serovar Typhi (SARC2) is the only strain in the collection possessing DNA
hybridizing to this fragment (Fig. 4).



SEQUENCE LISTING NO. 1

<110> Folkesson, Anders

<120> The complete sequence of the cs7 insert in Salmonella
      enterica serovar Typhimurium

<130> Complete sequence of the cs7 insert

<140>
<141>

<160> 5

<170> PatentIn Ver. 2.1

<210> 1
<211> 46870
<212> DNA
<213> Salmonella typhimurium

<220>
<221> CDS
<222> (37368)..(37868)

<220>
<221> CDS
<222> (37952)..(38689)
<223> safB putative periplasmic chaperone

<220>
<221> CDS
<222> (38713)..(41223)
<223> safC putative outer membrane usher

<220>
<221> CDS
<222> (41245)..(41715)
<223> safD putative fimbrial subunit

<400> 1



gatacaaatc tcagggtgtt tttatacatc ctgtgaagta aaaaaaaccg tatcactgta 60
aaagggatac ggtttttttt cgtcttcaag aagttccacc gtctatcgtg gaatctggcg 120
caaatgggcc tacgcctgga tgacgaacag gatattaccg ccacttcttt cactgtcatg 180
gctattttga tcccactgac atttaaggcg cggcctcatg gcggtgctta accgggatcg 240
ggacatgttc agcgcagaag cagactgcgt aatgttgata tcactcagat aattacggag 300
aaccgccaga catgcgcatc atcactccag ggcatcccac ttctccagca actccaccgg 360
gatctcattg atcacctccg agaaccgttt tcccaccagt ctttcagcct ggcgtaacag 420
tgggatggtc gggctactgg gttcactgct ctcaaaccag cgacgaatgc accgcaggcg 480
ttccagcgca tcattacgat cgcgaattgg ccccggttca gcattgtgtg ggagggatat 540
accgggtggc gcgttgacag gcggcacatc ctctgcccgt gccggaacgg cattatccat 600
aaccgtgcct gcggggtctg gcactggcgc tggcgacgga gggatttccg gtgtggttgt 660
ctgtaccacg tcaggcagca gcgctaacaa ctgccgcagg cgtgaaaaat ctggagccag 720
atcgcctaat gtctcacgcg cccacacttg caggcgttct gcgctttcct gcgcttcgcg 780
aaacgccgcc agcggtaatg ccccacgggc ttcgaggtcg gccagttgct ggcggacgga 840
ttcgggggcc agcgcatcag ccgggcgggg agcggataac gcgcgctcca catcccgtac 900
ctgtaggcgc agggcggcac tgtttgacag cgtaataccg cgaatgtccg ccatcacgcc 960
ttcgtgatcc agcagcgccg ccagggcatt actgcgcgcc agcgacgcat cttcaataga 1020
ttctgcggat gcgccttcgc cggtaagcag ctgtggatgg agtgcatcag accagattac 1080
gctcagttct gccagttgtg tgagcatttc cgccagtccc tgtgcgccag cctgctggat 1140
gcgactgcgc agcagcagga tcaacacccg gatatcctta ctgcgcgtga gcagacggcg 1200
cgcatcacgt tcaatttccg gccagttcac ggcttccggc gtactgacaa aatcaccata 1260
ttgcgcctcc gcctgggggg cggcgcggct gaacagcagc aggtattccg gatcgtactc 1320
cggatcgggg ccgcagggtt gttccgcgct taccggttta gtcagggaca tgtccatatc 1380
attactctca gtgggttaag ccgtgttcag gttcaaaaat catgccggtc accggacggt 1440
catgcatggc tttacctgcc caggccgtcc agccaagccg ctgcgcgccg ccgatcaccg 1500
caggcggtgc ggcatgaggc ttcaactgca gctcaatttc ccagcaatat tcaaaaccga 1560
tgaacgtacg caccagttct gtcagcacgg gcagattgtt gccgccgggc agaaaacgca 1620
ggtaatcctc cagcgtgagc gggccgataa ttagccggaa tttgtactgc atatccggta 1680
cggcctggcc gataagcgcc ccgttgccca gcaccgaaga ctgcgcgggc gtgcccagac 1740
gggtgatttc atcgttcgcc accgttatcc agtgcagggc gaattctttc accgcaaaag 1800
gtacgctgaa atagtgcgcc agcgtggcgg ccagcccgtc aggattgcgc gattcgcgta 1860
ccagatgggc ggaggctgcc aggcgaacat gatctgacag cgggctttcg gcgctttccc 1920
gtagatcctg cccgctgaga ctggcgatat aaaacgcaaa acggtcgtgt tccggtttgt 1980
ccagcccgcc accagcggac tgggcgctgc gccatgcctg ccagaactgc gtcagccagc 2040
ggtggtgaaa aatattggaa aaatgaacca gcgtcggatc gtgacgactc tctgagcggt 2100
tcagtgccag ctcggtatag tgcagcggca acgggccgtt tggcccccat agtccgaggc 2160
tgtacaggct caggtgcagg cgtccatcct gccagctgac ctgggcgatt tcccgtggcg 2220
caaaggtcat cgtcggcgtc tgtcccagac ggaatttttc catccgtggc ttgccagata 2280
cttgcctgcc ggagtatcac agagctgggc atccacgcgc cgcatcaggt tcaggaatcc 2340
gtaacgccag ggggttttct gcgcctggtt gagagcgccg gtcatacgct cccccgttgc 2400
ccggttctga ccggccaggt catgacatgc ccgcgttgca tcgagtgcag cgtcatctgc 2460
gagaaggtat taatggaaac atggcgggca atatagtgtt ccagcaccag accgaacagg 2520
^ i a n ^ « ^ 

CaciggacLga 


U dCCg^adaa 


*- — <• » 4- ^ ^ s"*fT 

tcctuCLLCy 


^ j"* i ^ i**f ^ ^ a 

LCCaCygi'va 




yacyccccgy 


^ ? D u 


ccacagacca 


s 4 ir*t ^¥ a 

acaggccgga 


f»"f «^ m 0m<*m\0m fm *m 

gccgggcagg 


cggcgggtca 


ccggggcggc 




^ O^t u 


atcaggctgc 


gcacctggcg 


cgactgcggg 


ctgtcgtgag 


ccgggataaa 


cagau tcagc 


z /uo 


agatcgcgca 


gcgcctggcc 


gccggtgcgg 


tga tccagat 


cggccagcgg 


cagataatta 


2 76 0 


aacgacaact 


gccggatcag 


ccgccaggcc 


atttcgcgtt 


cagccagcgg 


cggctgcggc 


^ A O 

2 820 


gggcgcggcg 


gtctgataag 


acccacgccc 


gccaccggaa 


tcgccgcatc 


tacggtcaga 


O ft 


tcatcccggc 


cattacgtgg 


aataaggcag 


ggcagatcgc 


ggttagtcac 


cattgccgtg 


2940 


acggtgatat


ggcgcagatt 


ttccgggtag 


ggcgcttcat 


gctgatcaac


cagcgagagg 


3000


aagacttccg 


agccggtata 


gggggttcgg


gtgccatagc


ggcgggcgtt 


tcctgacgag 


3060


cggcgcggtt 


cacgacgcag 


tgaaaaataa 


cgcccgtggt 


tgccttcgtc 


attattacgg


3120
















tcctgaacag 


aaaacacctc 


gtaatccagc 


ggacgggtac 


gatccaccac 


cagatgctgt 


3240 


tccgtcacgc 


tgtgagtgac 


ttcaatccgg 


gtggtggtgc 


gaggaagcag 


gttgatcacc 


3300 


ggcgtacaga 


acaggctgaa 


ctgtgcagcg 


tccgtctgat 


gaatcagcca 


gtccggcggc 


3360 


aggcggttaa 


gcagtatgac 


aatttccgcc 


acattgccct 


gcaccttttg 


taacccggca 


3420 


gacaatccgg 


tcggggtgaa 


gaagtaaaac 


cgttccggac 


aggcgaaaaa 


ttcatgcagc 


3480 


agattatggc 


cgtgaaacac


gttccaggcg 


agcggtagca 


gcccctgccc 


tggctccagc


3540 


ccttcgtgcg 


ccaccgggtg 


ttgaagattc


acattcagtt 


cgccgtcaaa 


gtgaccgggt 


3600 


tcaccggcca 


gtgtggcgac 


ggcgctggta 


tgtagcagct 


caaacaggtg 


tgacgcaatg 


3660


cgttcttcgc 


cgcagaggta 


aaagggcagc 


cgtgccggac 


cggccagctc 


gctgaaagtc 


3720


agctccccga 


aggttcgcag 


ggtgatgcgc 


aatgccccgg 


cgacatgaat 


attaggcggc 


3780


agatagcggt 


gcagggcggg 


catatccggc 


ggcgcggcgg 


tcaggcgtac 


ctcctcgatg 


3840


gacagcggcc 


acagcgtgac 


gtcctggctg 


ctgcgaaact 


ggcaggcggt 


attttcgcct 


3900 


tccgggatgg 


gggaaacgaa 


cgcggtatcg 


cgcggcacgg 


tgaccccttt 


cgccaggtcg 


3960 


ccttcctgcg 


tatcgggata 


cagctttacc 


actgccatcg 


atggcgtggg 


ggtgacgtaa


4020 


ttggggctga 


cgacttccag 


taaccgctgt 


gtgaagcggg 


gaaactcggc 


gtcaattttt 


4080 


agctgagtgc 


gggcgctcag 


aaagctgaac 


gcctcgatca 


tgcgttccac 


atacgggtcg


4140 


gcaatatcgg 


ttccctgcat 


ccccagtcgg 


gcggcaattt 


tgggatggag 


ggtggcgaac 


4200 


tcagcaccgg 


tctcccgcag 


gtagctcagt 


tcgcggttgt 


aatactccag 


tagccgtgga 


4260 



tccatgaata 


atgccctgta 


ttaaaagaac 


gtaatgcggc 


tcagctccat 


atccagcgcg 


4320 


ctgcgtacca 


gaaactccgt 


gggatacggc 


tgcgtcagaa 


tttgtccgcg 


aatttcaaac 


4380 


tgtagcgtgt 


tatagctacc 


ctgccgggtt 


ttatcgagca 


agggcgtgac 


cctgagtgtg 


4440 


gcggcgttca 


gccggggttc 


gaaacggata 


atggcgcgcc 


ggatcgcctc 


gctgatatcg 


4500 


tcccacttat 


gctcatacat 


aaagctgccc 


gccagcggcg 


gcaggccata 


gttgagcact 


4560


gacgccgccg 


cctgcggata 


gcgccgggcg 


tcgatgtcac 


cctcgtggct 


aatggtattg 


4620 


agcaaaaagg 


agagatcccg 


gcgaatgatc 


tccttcagtt 


gtaccggcgt 


gacgctgata 


4680 


tcccggtcaa 


ttttctgata 


cggagcattg 


tcacacagcc 


gatcaaacag 


cgtgggcagc 


4740 


aggtggttag 


cgggtgtaaa 


acgagacgtg 


ctcatgcgcc 


atcgttttcc 


tgagcatgaa 


4800 


aggtacaatg 


ggccatgtcc 


agcaggctga 


tatcgccgtg 


gctggtcagc 


cacactttct 


4860 


gccccagcgc 


ccgcacggtg 


gtttcgccgg 


ggccgtcctg 


ccaggcggtt 


tccctgcaca 


4920 


gacgcagggc 


gtcggatgca 


ctttccgaac 


cgctgtaacg 


ggtaaagagc 


caggcgccgt 


4980 


gcgtatcgcc 


attcaccagg 


gtgatattaa 


cgggtttcca 


cagcagatca 


gtcaggcgcg 


5040 


tcggttgcgg 


cgattccagc 


gagcgtattt 


gcgaaaacgg 


cagccagata 


tacacgccgc 


5100 


cggtgactag 


ctcaagtacc 


gggccaaggc 


gggaatcgct 


gtcgctcgcc 


cagtcaaatg 


5160 


cgccgccgtt 


ccactgcccg 


cccgtgtctg 


ttatggcttc 


cagtgcggta 


ttacggtgtt 


5220 


tatcaacctc 


accggtatcg 


tcatgacagg 


cgagtgccgc 


cagcagtgac 


tccacccaaa 


5280 


cgggctgcgg


cagaagaaaa 


ccgggtcgtt 


gttcaccctg 


aaaaacggtg 


tggcggaaca 


5340 


tttcgcagcg 


aaccagctcc 


cggtacagcc 


gggcctcctg 


ggtataattg 


gcctccatcc 


5400 


tggcgcatag 


ctgaagctgg 


tgtagcgccc 


gcgaccagtc 


tccggccaca 


cacagcaact 


5460 


gaaacaggct 


gtggcggcag 


agcgctttcg 


ccggattttc 


ccgaacctgc 


tgctccgcca 


5520


tctgaatccc 


ctccgcaata 


gagtattcct 


gtatcagcgc 


ggacagggta 


gcaggaagcg 


5580 


tgtcagtttt 


tttcatgggc 


ggtatttcca 


tttttctgtg 


tcggagtgat 


tcggtagtgg 


5640 


ctatcgatgc 


caataatacg 


atgctcgcgg 


cgggtcagat 


cgggcagtat 


ctcgctatgt


5700 


gcggttttcc 


cccccagttc 


cggcgagagc 


agttgcagga 


tatcgggcat 


tgattccatc 


5760 


gccagccagt 


gcatttcacc 


ctcaccggtc 


gtgtccagcg 


tatccagaat 


ggcatcaata 


5820 


ccgggcgcgc 


ctgccaccat 


atcctgtaac 


gtgtcggtgt 


cgcccttttt 


atcgtataac 


5880 


gacgtcaaat 


cctgggtgac 


agcgtcttca 


tttcggggaa 


acggtttagc 


acggaggggc 


5940 


ttttcctgcc 


cggagggccg 


caatgcctgc 


tgatattcct 


ggtaaagctg 


gtgtagtgtg 


6000 


ctatccgctt 


catggccagt 


gtaagaggat 


gcggtttccc 


ggacgggaat 


aatatcgaac 


6060 






ftfrfi t- 4~ t'nr'rt 

gg g cue L.y <_y 


y = w ayv. uy 


ttattattta 

W w W W W0 


aaccaatcca 

a u w w ^y W W W 


aatccaacrta 


ctcagcaacg 


6120 


ydLLCay aty 


a*"*t"f*iaafr"'aa 


L*Lyayyaacic 


yaaa^»uV'yy 


aacocraactc 


atcattaata


6180 


r'rirtrtr'a a ftfif* 


d W WGly MO wy 


eaaaceecat 

wnyn w w w w n w 


tcaatcatat


ctccatcatt


cagccgcata 


6240


Lyy Lyd Lyay 


y w ww.wdi*w.aw; 


rtoetcatto 




acaattcatc 


aoaataatta 


6300 


^ a t~ a. 


ar , fTf , f~t"r , t" t~ 

d WU W W WWW WW/ 


ofcaocaaaca 

w^^mj w^^cm w ^ 




coat at cat c 


tttcccatgt 


6360 




y *->yy dd W y ^ d 


yyyyy^-yy-y 


araotaaata 


caatatatdt 


www w^w^j^j^j*** 


6420 


tachcftacot 


tatccottat 


accctacaac 


tgtattttcc 


gtaattccca 


catgtcttgt 


6480 


L U^aLLa Ly 1^ 


L>^>^ t L add U y 


fafttatttt 

w ct w w w n w w w w 


ww^ wyyoyyou 


acat 1 1 acroo 


agttttaatt 


6540 




4— ^ ^ #-i 

LLtaaCUaaa 


1. 1. LaLdy^y a 


yuw>ctw wai> wy 




ttttattatt 

www wn w w** w w 


6600 


ttcgtcgtga 


atgcattgtt 


gtatgcatag


atgtcttttt 


tgaaatatta 


ccccctccaa 


6660


4— 4* m ^ ^ 

tcctgcaact 


gcgacwCt.ua 


w C w w g C w g t w 


LotLyaL wet w 


t^woa L. cl w wad 


w y y w w way w w 


6720 


+ 4"" ^ ^ 4- 3 

gCtddLLdyL 


» « J** ^ ♦— /-^ * 

ccccyatcty 




w wy w w L. wy d w 


atttLL w wye* 


aactatcaat 

y^j w w n w w w 


6780 




tt£tSt£ttt 


statgeattg 


atgcattatt


tttatoaatt 


ttt?!tgfccac 


6840 


dayycatoau 


dwet uyyaaeiL. 


LUuLy WW^CIi 


i>y*»o.y wy v>y w 


tofcat caaaa 

W>J W W *J **^y»** 


actaaccQOC 


6900 






d u w-y y w,d*»w»y 


ywawd w wyw>a 


aactacactc 

UdW WM^*^JW WW 


taatccctaa 


6960 


gctgagctga 


ctcactyyct 


^ a ^ ^ s ^ ^ a 
yvoCCdyi, wa 


ct w ct w ay w. cty w- 




tattctccac 

w a i« w w w w w w 


7020 


gt ccuccggc 


a h a f a a ^™ 

accaccayaL 


^ #-« #— • fr- fr- *^ 

CCCUCLLLCt 


r , ra H f"nt~rf < trta*Ta 
y ci L.y u y y cty ci 


ctciy t>y w t- cx w w 


w-^yy wadw> wy 


7080 




ccg cc 9999 c 




aH ^ ^ ^ 


rtraccatat 

W> W W>H W W Q W« W 


caatctcaac 


7140 


rrl~ t~ **t a a a afi a'i 
y L- Lyaaaayy 


w. w tyya Ly w. w 


yy y y y 




ataacaaaat 


tcacaacaac 


7200 




fooccttatt 

wy y w w w wy w w 


Q V# CI ^4 k» ^4 


aa act create 


oaatactaaa 

yyy ^-y**3 


cagtatctgc 


7260 


gcyccgci-gg 


r*f a tariff" t" <^f* 

v» w»w.y W> W W W> w. 


t"l~f5;4faaa 
yy i»tyawyao 


d t rra eooaaa 


tactaccctc 


attaatcaaa 


7320 


acaccgccgg 


aagcgcaggci 


yLyLUL w w ct w 


a ft fir* ^ f f* ft 


y w u i>y y i*ct u^ 


aoccattccc 

Ot^yft W- W* W W V W 


7380 


ggtgaaagca 


g ucaggega u 


Lcccaacyyc 


gggcaggacg 


^ a a a f* f*ft f* 
y Lada LCwy 


y w- l. y y w ci ci ci ex 


7440 


tactgtcagg


acatgacggc


acaggcgcgc 


S ^ f"T ^ S 9 Q 9 

yoCyyccidda 


vCyaUwwyy u 


gaeggggt-gt 


7500 


ydy C a C y clcLcl 


tccgcaccoL 


yaCyydLal. w 


f* ^ ft f* t - f*fff f* 

wy^> uyLyLC 


y L^yu^dydd 


W^yl^l w w W^B 


7560 


CuyaCLggcg 


aggcgggcgt




y^yy i#\-y u- y 


aaaattttoc 


cctcacaatt 


7620 


gcgcaggggg 


o ci y ik* y u. y 


w.y ^yL. Ly i»yy 




^» » V— i*yywyw w 


aaacattaac 
yy oi *y w ^yy^> 


7680 


gctctgttgg 


ccggagccag 


catgaaaggc 


gagtttgaat 


cgcgtctgaa 


agggttactg 


7740 


gaagaggccg 


ggcgctcgcc 


gcagccggtt


attctgttcg 


ttgatgaagt 


tcacactctg 


7800 


gtgggcgcgg 


gcggcgcatc 


cggcacgggc 


gatgccgcta


acctgctgaa


accggcgctg 


7860 


gcgcgcggca 


ccctgcggac 


tatcggcgcc 


accacctgga 


gcgaatacaa 


gcgccatatt


7920 




gagaaagatc 


cggcgctgac 


ccgtcgtttt 


caggtgttgc 


agattgccga 


accggaagag 


7980 


atccccgcaa 


tggaaatggt 


gcgtggtctg 


gtggatacgc 


tggaaaaaca 


ccataacgta 


8040 


ctgattctgg 


atgaggcggt 


acgtgcggcg 


gtacagcttt 


ctcaccgcta 


cattcccgcc 


8100 


cggcagttgc 


cggataaggc 


catcagcctg 


ctggataccg 


ccgcggcccg 


cgtggcgctg 


8160 


acgctgcaca 


cgccgcctgc 


cagcgtacag 


ttcctgcgcc 


agcagctaaa 


agcggcggaa 


8220 


atggaacggt 


cgctgttgca 


gaagcaggaa 


aaaatgggga 


ttcagtcaga 


tgagcggcgc 


8280 


gatgcgctga 


tggcgcgaat 


tttctcgctc 


aacaatgaac 


tgactgcatc 


cgaatcccgc 


8340 


tggcagcggg 


agctggaact 


ggtacatacg 


ttgcaggaac 


tgcgtctcgc 


agagtctgat 


8400 


gctgatgaca 


aaaccacgct 


gcaacaggcc 


gaaacggcgc 


taagggagtg 


gcagggcgac 


8460 


gcgccggtgg 


tgttcccgga 


agtcagcgcg 


gcggttgtcg 


cggcgattgt 


cgccgactgg 


8520 


accggtattc 


ctgctgggcg 


catggtgaaa 


gatgaggcca 


gccaggtgct 


ggaactgcct 


8580 


gcccgactgg 


cgcaacgcgt 


taccgggcaa 


gacggcgcgc 


tggcgcagat 


tggtgaacgt 


8640 


attcagaccg 


ccagggcggg 


actgggcgat 


ccacgcaaac 


cggtgggcgt 


gtttatgctg 


8700 


gccgggccgt 


ccggtgtcgg 


taaaaccgaa 


accgcgctgg 


cgctggcgga 


ggctatctac 


8760 


ggcggcgagc 


agaacctggt 


aaccatcaat 


atgagcgagt 


tccaggaggc 


tcacaccgtt 


8820


tccacgctga 


aaggcgcgcc 


gcccggctat 


gtgggctatg 


gcgagggtgg 


tgtgctgacg 


8880 


gaagcggtgc 


gtcgccaccc 


ctggagcgta 


gtgctgctcg 


acgagatcga 


aaaagcgcac 


8940 


catgacgtcc 


acgaactctt 


ctatcaggtg 


tttgacaagg 


gtgggatgga 


ggacggcgag 


9000 


ggtacacatg 


tcgatttcaa 


aaacaccacg 


ctattactca 


ccaccaacgt 


gggttccgac 


9060 


ctcatcagcc 


agatgtgtga 


agatccggcc 


ttaatgcccg 


atgctacggg 


gcttaaagag 


9120 


gcgctaatgc 


cggaattgcg 


caagcatttc 


ccggcggcat 


ttctgggccg 


cgtgacggtg 


9180 


atcccttacc 


tgccgctgga 


cgaaacgtcg 


cgtggcgtga 


ttgcccgtct 


gcaccttgac 


9240 


cggctggtgg 


cgcggatgag 


tgaacagcac 


ggcgtgacgc 


tgacgtatag 


cgaggaactg 


9300 


gtcgcacata 


ttgtggcgtg 


ctgtccaatg 


catgaaacgg 


gcgcgcggtt 


gctgattggc 


9360 


tacatcgaac 


agcacattct 


gccacgactg 


tcgcgctact 


ggttgcaggc 


catgacggaa 


9420 


aaagccgcga 


tcaggcagat 


tgatatcggc 


gttaatggtg 


atgagcagat 


tgtttttgag 


9480 


atcgtttgct 


gaaaccggcc 


gttcgaagtg 


tccgtagtgc 


gattttaaaa 


actgtaccgg 


9540 


tataccgctc 


cccttgcggc 


aaccagttga 


ctaaaaagga 


aatgaaggat 


tatggctatc 


9600 


aacaatagcg 


cgcagaaatt 


catcgcgcgc 


aaccgcgcgc 


cgcgcgtgca 


gattgaatat 


9660 


gacgtagaga 


tttacggttc 


cgagaaaaaa 


atcgagctgc 


cgttcgtgat 


ggcggtgctg 


9720 






/*■ /^i /*i a h H rt^Y 

gccgai.ct.gg 


(_C^yyycLClClL-C 


y eg tyaayaa 




LLyaLaaLL t 


Udauyay (jy v- 


ycty uy c cy « 


aLa^y ^ L- y o 


yyy^y^yy 1 * 


a t" nna r* , ***f ^ 




u \— d.y a. l* L^y w 


y aay i^. i__ *«.y t^. o 


h rtm^ 


y aa^^# Lk^c ay 




aha fl^ft'CThh 


y uayya^.^ w y 


aa a f CtCl Ct 

d d d c '-yy^^y 




dy ot y 


^+*t-^ h t*t"aa nn 


aaLLLLLaLy 


yLdaat>ay L.O 


Ct L- CtC (-.y Vh- C L- L- 




yaa Ly a i_y 


ctacccaggc 


ggcaaaagcc


gtggaagccg 


« ^ /—* ^ 
ecagegd cga 


*~r #^ ^ a h a a a 
cycwtaLdaa 


dyCatCdycy 


ns a a a fr* 

CCyaaCayat 


^ a a a i~ na. t~ ^* 

vJaaaULya Lw 


^* ^ a a ^ a t~ ^ 
CuyCadua L.<_ 








A A a i" ^* ft 

tyaclLLLy 


a a a A /"r a A 
oodaya u^cici 


tLycygcyCa 


a UCaddyCCC 


gauge tcaag 


a a a V- a ^* a 

aaaCtyLaby 


CC tacggctig 


tatcattgeg 




ccggccccac 


CyCCdodyLC 


gccgcgucgg 


« frr* 4— 


r*i #■* a a a. ^* f^Tf^i a 

ycddauyydL- 


+• m a -\ rtr^ 

lcc uygLdyy 




na arr^h nn a a 

WQ CL ^> tu> 1_ ^ a. 




yLLdLat 


LydLyd ^y 


/"•ort^rt t~ f h hr* 
t-t-yt-yuuuuc 


dCCLyyLyya 


/Tranht' hrtaf- 

uyay LLuyaL 


^ +~ na ana an 
u. u L.y day cLciy 


t c tggagcaci 


cgcggccucic 


gcgatgggcg 




y-v a ^ ^ « 

yLLyaLCvjy^ 


ggt-gLyyaat 


dLdCCCLCCC 


ydccgacgat 


ggcggcgcgg


ctgaccgccg 


cgaggctgaa 


ccggcgaaaa 


actcagacta 


tgccgccctc 


accggcgcac 


auccggacyc 


gacggccaac 


gccaacccgt 


cgcgcttcgc 


tcacttcctc 


aaatgtatcg 


gtgaggatat 


gcagcgctgg


ctaaatgaat 


tgaactcctc 


gcaagaaact 


aaagcccgtc 


aagaggtcga 


aggcaatcca 


ggttattacg 













tgacggatcg


caaattcctc 


9780 


atoaaaacca 


ttacaccoca 


cgtggcgttc


9840 


caottcjatQQ 


tcgatatcac 


gctggaaaat 


9900 


CQCaaqQtqO 


acgccctgaa 


ccagttactg 


9960 


acctacatgg 


atggcaaggc


gggggcggaa


10020


actctgctga


aaaccctaac


gaatgcgccg


10080


ocqqataatq 


aatcagcgga 


ataacgtcga


10140 


atatcfcacrac 


aaccgacgcg 


gttgctcagg


10200 


cgttgctgaa 


tcaggccttc 


cgacccaaga 


10260 


caotacaaac 


gctggcgaac


acgatcaccg 


10320 


ctattattgc


gcagatcgac 


tttaaactga 


10380 


ccgactggca 


gaagctggaa


tcctcgtggc 


10440 


acfgccoaccra 


aaaoc toaaa 


attcacttca 


10500 


acatgaagcg 


ttacaagggc 


atcgcctggg


10560 


aagccgaata


cggccagtta


ggtggcgaac


10620 


tcgaccatac 


accgcccgat 


gtggatctgc


10680 


cccatgcgcc 


gtttattgcc


ggggcttccc


10740 


aactggcgaa 


tccccgcgac 


ctgaccaaaa 


10800 


ggaactcgct


gcgggctagc


gaagactccc 


10860 


ttgcccgcct 


gccgtatggc


gcaaaaacca 


10920 


atgcggatgg


ttctgaccat 


accaaatacg 


10980 


taaacatcaa 


ccgttccttc 


aaacactacg 


11040 


caggcagtgc


ggtggaaaat


cttccctgcc 


11100 


acatgaaa^g 


cccgaccgaa 


atcgccatct 


11160 


acggttttat


cccgttgatc


caccgtaaaa


11220 


aatcoctaca 


aaaaccacag 


gaatactacg 


11280 


ctacccatct 


accgtacctg 


ttcgcctgct 


11340 


tccgcgacaa 


aatcggttcc 


tttaaagagc


11400 


ggattatgaa 


ttatgtcgac


gccgatccgg


11460 


atccQctaqc 


tgccgctgaa 


gtagtggtgg


11520 


acgcgaaatt


cttcctgcgt 


ccgcatttcc 


11580 




agcttgaagg 


gctgacggga 


tcgctgcgcc 


tggtgacaaa 


actgccgtca 


gtgaagcagg 


11640 


gcaatgcctg 


atatatattt 


tgtgaatgtt 


taagcgagtg 


aagtcagaga 


agatagagaa 


11700 


tataaagagg 


gatatgaaga 


aaagaatttc 


gtctcgccca 


cggtctcgta 


aaggtggggt 


11760 


acgtaatgat 


gacacatatc 


cgaatgccag 


taacaatgcc 


gaagcttttt 


atatcattga 


11820 


gtaggaaata 


catattatga 


ccataagccc 


aacttttcat 


ctgttacctg 


gtattgttct 


11880 


gctctcttca 


caatatgctg 


tagcctggga 


agtcagttgc 


ccggctgtta 


ttgatactca 


11940 


gtcttctgct 


gtgagcctga 


agtctgatgt 


cccagcggcg 


tggcagcttt 


ctccccgata 


12000 


tatgtcgcgt 


ttatggttaa 


gtagtattgg 


ggtaacgcag 


ggtaaacctg 


aaaacctgat 


12060 


ggatctcaaa 


ccagagacta 


aaaaagtaaa 


cggtgaaaat 


tggtctgtat 


gggaaacaga 


12120 


acgtggtagc 


gataaagaaa 


ccgatcgcta 


ttgggtttcg 


tgtatttatg 


gtcatgaaca 


12180 


gatatggttg 


acgcaaccaa 


tacctgcttc 


ttctactcgc 


tgtaagactc 


gtaattttga 


12240 


gggatcgcca 


gaagaccagt 


ctgtatcttt 


tatctgtaat 


tagcgatttg 


agacgtgaaa 


12300 


atttcagtac 


aggttatggt 


ttttattatc 


ggaagttatg 


aagcattatt 


tatatgcatt 


12360 


aaataatgca 


aattcataaa 


ataactaaat 


acattatcgg 


taccggaaaa 


atatacagtc 


12420 


ctctgttctc 


ctgaagttat 


tggagaagga 


ttctgtacgg 


caatgattta 


tctataaaca 


12480 


aaaagatata 


gataaaatca 


ggtttatttt 


aagtaaaact 


taataaggat 


ataaaaatgg 


12540 


cttatgacat 


ttttttgaaa 


attgacggca 


ttgatggcga 


gtcaatggat 


gacaaacaca 


12600 


aaaatgaaat 


tgaagtactg 


agctggcgct 


ggaatattca 


tcaggaatcc 


accatgcacg 


12660 


ccggtagcgg 


cctcggctcc 


ggtaaggtct 


ccgtcaccaa 


cctggatttt 


gatcactata 


12720 


tcgaccgcgc 


cagcccgaac 


ctgttcaaat 


actgcgcctc 


cggcaagcac 


attccgcagg 


12780 


ccattctggt 


tatgcgtaag 


gctggcggca 


atccgctgga 


gtacctcaag 


tataccttca 


12840 


ccgacctgat 


tgtcgccgtg 


gtttccccga 


gcggcagcca 


cgatggtgaa 


atcgcctccc 


12900 


gtgaaacggt 


ggagctctcc 


ttcagcaccg 


tgaagcagga 


atacgtggtg 


cagaaccagc 


12960 


agggcggcag 


cggcggcacc 


atcaccgcag 


gctacgactt 


caaggccaac 


aaagaaattt 


13020 


aacggctgtt 


tttccggcca 


gatgttatgt 


ctggctggtt 


ttattgtttt 


gattttaaag 


13080 


gaatttacag 


tgaataaatg 


gcgtaacccc 


actgggtggt 


tatgtgcggt 


agctatgcct 


13140 


tttgcactgc 


tcctgctttc 


cggatgcggc 


agtagcgatt 


cgctacttga 


cccctaatcg 


13200 


cagcggcctg 


gcctgagcgt 


gaaagcgttt 


tacaaggtga 


attctgacaa 


tcagaagaaa 


13260 


gcggcgtcca 


tgaagatacg 


tgttgagaat 


taatgaccta 


cacagaattc 


ttagaggtta 


13320 


agcaaaatga 


acagaccttc 


attcaatgaa 


gcgtggttag 


cttttaggaa 


ggtgaatcat 


13380 




J>*^ "fr - rf"* J™* ^ >^ 

cccgccgctg 


acgcgggcag


CaLCactyy t 


ggaaacgctg


-i ^ 3 a ^ 
yyaoaaatat 




1 "KAACl 

X J t V 


cane cues 


atgcccgccc 


cacucgaacg 


^ a ^ f ^ 


egaaugegae 


agggttccca


13500






tyLadayyt t. 








13560 


eg eg Cgda tg 


^ ^ ^ +- 3 V- ♦- *-^r -\ 




CaLciutat-yg 


/— f o 3S 53 rfT/""* 1" t~ 3 

^^aayc Ly a 




13620 

X <P v v 


^ a f- o i t~ rrio 
da l_cldl~lJl~^Gl 


cLaLay ay i.y a. 


t>tL vat ^333 


adgaaaygda 


af - j-«/-t f- art t" 
t. LaLvtj Loy t 




13680 


ggc tygagca 


avyCCayayg 


-3 /*■ ^ i— ' i~ 4- ^ y-* — i 

aCaCyttaCa 


^ ^ a +~ rrrr ^ 
LLacggaaLg 


^Cdtj Lo Ly 






f- sT A ^ *-i *^ 4— = 




aycll-clGll~yycl 




ccgaagttgg 




13800 


a *- ±* A ^-i y"t fc* 

acaccgccgt 


gaaacggcca


a ^ 9 H t* flrtf- 4- a 

acaccagcca 


^ ^ ^ ^ a a 4*> 

CCtCCagCat 


aaguggaaug 


Cuygta tyy c 


13860


agccatcctt 


tgcacaagaa 


gcaccgacca 


cacaataccc 


acagueggaa 




13920


attgggcgct


gagtcattgt 


ctggcattag 


tatacaaaga 


tgatgtcgtt 


aaaaacgatg 


13980 


ccagagctac 


ggccagtgct 


caccucgaac 


atyytaaaca 


aCCuytggag 


s ^ ^ s n n 
aCtvoCCdty 




aaattgatga


gattgcgaaa


aaauacccag 


ggtcgaaata


Lactcgguccg 


aid (- t-d i-*-ety 


14100








^ J3 t - j^y ^) JlfT 


(-jgp 9) ^ ^ 3j J> ^ 




14160


aaaggcgtgt 


cgagaagtaa 


n n I— b- « h a a 

aacccaaaga 


cactaaaagc 


9: ^ ^ ^ #T f~ ^ ^ ^ ^ 

aLdCgtCCtC 


fr- V rTr~* t~ ^ ^ t" 
L. LyL LbLytt L. 


14220


gaatttatgg


gtaagaaaga 


gctgtacagg 


aatagttaat 


<w ^ -A— a 

ccgtLcacct 


a a ^ a ^ a ^ 

aacaaagcag 




ataaatcagg 


gcttaattta 


ggtagttaaa


aggatagtag 


acacgcccca 


cgacattLtt 




ctgaaaattg 


acggcattga


cggcgagtca


atggatgaca 


aacacaaaaa 


cgaaattyaa 


14400


gtactgagct 


ggcgctggaa






tgcacgccgg 




14460


ggttccggta


aggtctccgt 


& /■» 4~ a a ^ 

CaCLaatCLt 








14520


ccgaacctgt 


tcaaatactg 


ctcttccggt 


aagcacaccc 


cgcaggccac 


fr- ^ rtrt ht"3^n 

t(>tyy l. Lctuy 


14580


cgtaaggctg 


gcggcaatcc 


gctggagtac 


ctcaagtaca 


ctttcacaga 


tCLyatCaCt 




gcaatggtat 


cgcccagcgg 


aagccaggga 


ggggaaattg


cgtctcgcga 


accaaEcgaa 


14700


ctctccttca 


gcaccgtgaa 


gcaggaatac 


gtggtgcaga 


accagcaggg 


cggcagcggc 


14760


ggcaccatca 


ccgcaggcta 


cgacttcaag 


gccaacaaag


n n «— w ^— ^ 3 




14820


cggccagatt


tatatctggc


cggacccatc


attttgatct 


f-4aa ^rrt a 3 "f- 
CaaaggddL k# 


f* o anffra 
L.ctLcLy tyadv 


14880


gaatggcgta 


accccactcg 


M ,M b 3 ft^* 

gcggccacgt 


gcggcagcta


cgccctLcgc 


actycttd u.y 




ctttccggat


gcggcagtag


cgatgcgtta


cctgacctcg 


aaccacagcg 


actcgacccg 




agcgtgaaag 


cctccgataa 


ggtgaatcct 


gacaatcaga 


agaaggccgc


gcccattgag


15060 


atacgtgttt 


atgaactgaa 


aaatgacgcc


gctttcacga 


cagctgatta 


ctggtcgctc 


15120 


catgacaacg 


acaaatccgt 


ccttaccgac 


gatttagtgc 


gtcgcgacag 


ctttattttg 


15180 


cgtcccggcg 


aagagaaaaa 


actgcgtcgc 


ccgctgaatg 


cgcagaccac 


ggcaatcggc


15240 




gtactggccg 


gataccgtaa 


cctggccaaa 


tcggtctggc 


gggtaaccta 


caaaatcccg 


15300 


gaagccccgg 


aaaaagcctg 


gtacagcagc 


ttcatatcgg 


ggaaaggaaa 


agtgcagttg 


15360 


gaggcggaac 


tggaacaaag 


cgccattgta 


attacggaac 


gggataaatg 


aattatgagc 


15420 


tggaatgacc 


gcgtagtctg 


gagtgaagga 


caatttttac 


tgccgcagat 


gtttcagcag 


15480 


caagagcgtt 


atctggaaca 


cgtcatgcat 


taccgcagcc 


tgccgctgac 


cccctttttc 


15540 


tggggattca 


gccactacaa 


tattgatggc 


gaagcgctga 


acatcggtaa 


actgatactg 


15600 


aaagaggcat 


cagggatttt 


tcctgacggc 


acgccgttta 


acgcaccgga 


ccacaccccg 


15660 


ctgccgccgc 


cactgaccat 


tctgccggag 


cacctgaacc 


agcagatttg 


tctggcggta 


15720 


ccggtacgcg 


cgccgaacag 


cgaagaaacc 


acgtttgaca 


ataacccgga 


atcattggcg 


15780 


cgtttctcgg 


tacatgaaca 


cgacatccgc 


gacgccaact 


cgctgggacg 


tggcgcgcag 


15840 


ttattacagc 


tcagtcatct 


gcgcctgcgg 


ctgctgccgg 


aaaaggcggt 


gacgggcgcc 


15900 


tggattggcc 


tgccgttgac 


ccgcatcacc 


gggttgaacc 


ctgacgggcg 


gatagatatc 


15960 


gaccacgacc 


tgatcccgcc 


catcattaat 


tatcaggcca 


gttcactgat 


gtgtacctgg 


16020 


ctgtcgtgga 


tcaacgatct 


catccggatg 


cgggccgatt 


cgctggcgga 


acggctgacc 


16080 


ggcagcgaca 


accacggcca 


tgaagcagcg gaggtctccg 


attacctgct 


gctgcaaatt 


16140 


ctcaatcgct 


ttgagccgct 


gctgactcac 


ctggcgaaaa 


ccccgctggc 


cccggaggtg 


16200 


ctgtaccgct 


acctgtccga 


actggccggg 


gaactctcca 


cctatgtgcg 


tccacaaacg 


16260 


cgacggcccg 


ctgaatacaa 


agagtacaaa 


cacctgacgc 


cctatgccgg 


gttgaaatcg 


16320 


ctggttgatg 


aggtgcagtt 


cctgctgaac 


gcggtactga 


tccggggcgc 


gcagcgcatc 


16380 


gagctgaaag 


aggggactta 


cggcatcctg 


aatgcggtgg 


tggccccttc 


cgatcttgcc 


16440 


gatttcagca 


cgctggtact 


ggcgataaag 


gcttcaatgc 


cgaccgatgt 


gctactgcaa 


16500 


cattttgccg 


cccagaccaa 


aatcgggcca 


tccgatcgcc 


tgccggaact 


gatccgctcg 


16560 


catctgccgg 


ggctggcttt 


gcaggttctg 


cctgtaccac 


cgcgccaaat 


cccgtttcag 


16620 


gccggataca 


tctattacga 


catccgccgc 


gagggagcat 


tgtgggaaca 


cattgcccgt 


16680 


tacggcggga 


tggccatgca 


taccgccggg 


gaatttccgg 


ggctggagac 


agaactgtgg 


16740 


ggagtgcgcg 


ataaatgaca 


gacagtaccc 


tgacgccgcc 


agcggcggat 


atgatgtcct 


16800 


ttttgtccac 


cacgccggaa


cataaggaca 


gtgaatatga 


aacgccggt* 


cacaccagcc 


16860 


agcgcacgga 


actcaatgtc 


atcgttgaag 


acggtccgga 


cagcaaactc 


cggctggctg 


16920 


aaatcagcgc 


ggcggctaac 


ccgttgctcg 


ccgctgcccg 


gcctttattg 


tgcgctctcg 


16980 


cagccatgcc 


cgctaaactg 


gatgcggccc 


tggtagagcc 


ttaccgtaat 


ctgctggtac 


17040 




gcgagatgca tctgtaccag acattatgcg atcaggcgaa cctgcggcgc gagcacgtac 17100 
tggcggtacg ttactgcctg tgtacggcgc ttgatgaagc cgccaataac acaacctggg 17160 
gacggcgcgg cgtctgggcc ggaaaaagcc tgctggtaac atttcatggt gaaagcgaag 17220
gcgggataaa acttttccag atcatcgggc gtctggcggc cagcttccag gagcatggca 17280 
acgtactgga ggttatctac cacctgctgg ggttgggatt tgaaggccgc tacagcgtgc 17340
agccagacgg gcgtaagcaa ctggacaata ttcgccagca actgctgaca cagctttcac 17400 
agcgtcgcga tccggttatg cccgcgctct cgcctgactt tcagggggcg ataagcggac 17460
gactgcggcg gatgcgccgg gtgccggtct ggctgagcgc cgggatagcc ctgttggcga 17520
tgctgacgct gtttggcctt tacagccacc ggatggatgt gcagaccgtc accgtacaac 17580 
agcatattga tgcgattggt ataaaactgc cgccgccgcc tgtgccggtt cataagctgc 17640 
ggctgaaaat cctgctggca aacgaaatcg cccgtggcct gctgaccgtg gacgaagatg 17700
accagcacag tagggtggtc ttccgtggcg acgccatgtt tgtgccggga cagaaaacgg 17760 
tgagtgacgc aatccggcca gtgattaaca aagcggcgcg ggaaatcgcc cgcgtgggcg 17820
gcgcagtcac tgtaacgggt cacactgaca gccagcccat tcattcggct gaattcccat 17880
ccaacctggt actgtcggaa aaacgggcgg cggaagttgc ggcgttgctg acctccggcg 17940 
gcgtacctgc cggacgggta catatcgtcg gcaagggcga tacggtgccg gtggcggata 18000 
acggcagtaa agccgggcgg gcgaaaaacc gtcgggtgga aattctggta gtggagtgaa 18060 
tgaatgatga aaaaatcaac ctatgatgtg tctcatcatt cggcagtatg tggcgtgacg 18120 
ggggattatt atcggatctc agcgacatat cacataacac gatctgttcg tgtttttttg 18180 
atcatcttat gttgcctgtt atccggtggc gcttttgccg gatccccgat taacgcagga 18240 
ttcatttccc ccgataatgt caacctcagt actcaggatt tcctgaaatt ttatgccact 18300 
gacaacgtac agaaaaaaga caatgcactg atgtatatgc tgggggtgga ggatgcgaca 18360 
gaaggtaaag cctggtgtgg atatggtcag gttgacagta taacaataaa ccatactgtg 18420 
ctgacctggt ttgaacagca cgcagtgaaa aagcctgatg taagggcttc aatactaata 18480
gaggaagcat tagttaaaaa ttttccctgt cagaggacag actcctccat aaaaattgct 18540
tcccggtcat ctcccatttt atccctgacg ccggatgcgc ttaatctttc aggtaatgac 18600 
ttttttaaat tttgggtgtc tggtaatcaa cgggataaac tcagggcggg tgtctatctg 18660 
ctcggcgtgg aggatgcgac agagaacaaa ctgtggtgtg gatacgcttt atttaagacg 18720
ctaacattaa atgaattagt ctatgtttct cttaaaaata aaaccaatga ggaactgaat 18780 
tctcgcgcgg ctgaacttat cataaataaa ttaatagagt atccctgtaa tatataaaat 18840 
cattcaagtt gcatcaaggc ggcaagggag tgaatccccg ggagcgtaca ttagttcgtg 18900
19440


tggttatatt 


actttatcca 


gggaatttta 


aaatattgat 


tattgggtgg


cgagtcgaaa 


19500 


tataatgcaa 


aaatttctta 


gtctgctttt 


ttcccggcgc 


gcgctggcag


ttgtgggcgt


19560 


tctggttctg 


gcgctgctgg 


tCtQQtttQt 


caoaccacta 

w 3 3 3 ** *" 3 w *" 3 


gtgtcatttg


ataccctqco 

^3)^^^3 


19620


cccgctggcc 


tccgtgggta 


accaaataot 


w WCft w ^y 


ctottctctcra 

w w ^ U v ^ b9 CI 




19680


actgtggctg 


gtcaactggt 


cgatgagtat 


catcggcatc 


agtgtcctgt 


acctaacaat 


19740 


tggcttcgtc 


acaccgctgc 


tggccctggg


cgatgtccat 


ccgtttgcgc 


cgctctgggt


19800 


ccgcctgacc 


ctgattggtt 


tcatcctgct 


gatgtacgca


ctatacoqcc 


totaccaact 


19860 


gtggcgtgcg 


ctgcgtatgg 


atgaacaact 


actacatcac


ttcctocatc 




19920 


agaggtaccg 


gtggcaggcg 


agatcaaagc 


cgacctgcgc 


accatcaacc 


atattatcae 


19980 


gcaggccatc 


cggcagctgc 


ogcaattaca 


crcrtaaatata 


cctaactaoc 


otaaaatctt 


20040 


/»na fTrtfT.s* ash 
y ciy y y ci cid c» 


Cyct LtctyC 


atgagctgcc


a tacit teat a 


otaatcqaca 

3 ^33 l - X/ 33*" A 


atcccaacaa 


20100 


cggcaaaacc 


acggccctgc 


tgaacaccag


attgcagttc


ccactaacaa 


agcaaatgga 


20160 


gcagacttcg 


cgcatcctga 


cagtaccggg


tggcggcacg


ctacactgcg


actggtggtt


20220 


taccaacgaa 


gcggtgttga 


ttgataccgc


cggacgctac 


gcccgccacg 


ataacoqtaa 

3 33 ^33 


20280 


tgaagcgagc


gccgcgcagc 


Qtaacqccaa 


aaaataacaa 

& 3™3 L 33 l * a 3 


oqetttctea 


atctactaca 


20340 


taaacatcgc


cccggcgcgc


cgcttaacgg


cgtgatcctg


acgctaaacg


taacaaattt 


20400 


aaccgcacag 


tcaccggcgg 


aacacctaac 


aacctocacc 

3 ^5 w w w 


□ctctacaaq 


cocoa ctcroc 


20460 


agaactgcgc


gagaccctgg


ggattcgctt


tccggtctat


ctggtggtca 


ccaaaatgga 


20520 


tttgttgccg 


gggttcagcg 


aatattttcg 


cacgctgacc 


agccatcttc 


gtgcacaaat 


20580 


ctggggcttc 


acgttgccgt 


acagccgcag 


gcgaaaagcg 


ggcgacccgc 


aggcgctgca


20640 


cgccgcctgc 


gcgcaggagc


tggcgcgcct


gacgctgcgg


ttggatcagg 


gactggatac 


20700 







ccggccacay 


gacigag uacg 


_l ^ ^ +* ^ 9 9 9 ~h ■ 

aCCtLdddd^ 


ccgccagcgg 


t* 3 ^ 3 

cngLacaccc 


tcccgcgtga 




yttcyccycc 


^ mm>m>>* j** ^ f-m 

cucggcgagc 


cgt. t.gctgga 




f + a fim n W * ^ ^ ^1 

cagacccc cc 


ccgattcaaa 


20820




aty Ldai^ Ly a 


__ t* _a __ _k i"* /■» t™ 

<a. taawacytL 


gcgcggggcg 


L L t L LCdb v-d 


gcgccgcgca 




yycyL.ayy^^ 


y acggggtgg 




gagcacc ugg 


cagegcut eg 


cccgggcgac 


20940


aaaadVf uy ^ 


eg tggegaac 


CCLCCyCCtC 


uctcccacac 


gccccgccgg 


acggcaaccg 




cagccactLC 


CtyCdLydCC 


tgcugacaca 


Mi^t-4- 

gec tact etc 


cgtgaagcgc 


acctggtgga 




yCCddaCCCC 


cagcgggcct 


ggcgttaccg 


cctgctgcgc 


ctcggcgggc 


acctgctggt 


21120






tgcggcaggg 


gacgcagacc 


ayCCdyCaya 


c ca a eggega 


21180


ctatctgaat 


gaaatcagcg 


cccgcgcgac 


ccggctggac 


ggtgatgtga 


aagcctacac 


21240 


cggEaaaccg 


gcgatggctc


ccgtcccggc


actgctggac


agcgcaaggg


aactgtccgc 


21300 


ctggccggaa 


ctggacccgg 


acgcgccgcc 


gctggcctgg 


cgctacggtc 


tgtacagcgt 


21360 


accgccggta 


accgacagcg 


tggcgtcgct


gtacaaccgt 


ctgctggatc 


aactgctgct


21420 


3 — <-3*- — 3<- -3 


3*- 3 '"""S3- 




3^-^-33 -33"*- 








Laaaycyycc 


^ 9 9 fr* ,« 

tacgacgccc 


cgcgcaccca 


cctgctaccg 


aacceggata 


aagaccacga 


21540


^ *"i ~* 

ci LadaLdC 


aacgcggcgg


agacccagtc 


gtgggcgact 


aacgaucegg 


^^^v 9 9 9 ^Y — ^ 

ggaacagega 




cagcgtggcc 


gggttcggcg 


ggcgcgccgc 


cgtgctgacg 


catatcgaag 


cgccgcticga 


21660


cggcagccgg 


gtggtgcact 


caccgtatga


gaaagatgag 


gcgctgatcc 


gccaggcgcg 


21720


ggcattcctc 


gacggtcaca


ccagtaccga


gcgtatctac


gcgcgggcgc 


tggcggcaat 


21780 


ggagagcgaa 


gcgccgcagg 


agttcacgct 


ggtacgcgcc 


gtcggcgcgg


atgcgggaac


21840


__ 1_ — . a. 4_ 

ggcccttgtg


cgtagcaacg 


gcgcgccgct 


ggatcggggc


gtgccgggta 


tttttacccg 


21900 


tgaaggatac 


cgggagctgt


tcgacaaacg


attaccggaa


^ ^ ^ - ■ 1 . IJIIJIJ IJI 

tttgtggcgg 


cggcgacggc 


21960


gaacgatggc 


tgggtgatgg 


gccgggagag


tacgccaaaa


aagctgactg 


acagcctgcg 


22020


cagccagata


ccggggcagg


agcagtctgt 


cgcccgcgaa 


gtccgccgtt 


tgtacctgac 


22080


ggaatatgcc 


cgccgctggc 


aggattttcc 


ggacagtatc 


catagtatca 


acagtgccgg


22140 


ggaagagggc 


agttccggcc 


tggcctatga 


tttacaggtg 


ctgcgcaccc 


tggcgtcgcc 


22200 


ggactcaccg 


ctgatgcggc


tgggaaaagc 


ggtggtggag 


cagaccacgc
22740


gttcgccgac
aggtcagcgc
cggagtgctg
atggcgtcac


gggatagcgt


gggacgcccg
accgggctgc
tctgccgtgg


ccgggctggc
gctggggcag
gttctggcag


cagaacagtg
tgtggatgcg
aaatagaggt


cagatgctgt
ccttaccatt
cggtcacact


gaaaaaaagc 


accggtgaat
gtggctgttt ctcaacatta atcaactata ccagcttact ggcgcaggat gttgagggtt 24420
acggacgcgg aagtgcatat tcccttcagg aggggaatat ctcactggcg aagatgcagg 24480
gacgttatcc tttcagcttt gtattatcgg tggataacca gtctgtacgg gatcagaagc 24540
tggcgctaat gatacgttgt acaccaccgc tgtaatacac agaatagtca gggagaagat 24600 
gatggcagta agactgactt ttgacgggca aaagctgaca tggcctggta tcgggatatt 24660 
taaggcgacc acggggttac cggatttaca gtggccagat aaacagtgtg tgccggatgc 24720
ggcgataccg gaagggaatt ataaattgtt tattcagttt cagggggagg caccgataag 24780 
aaatgctgcg gattgtgatc tgggaccatc atggggctgg agtaccattc cgcgaggcca 24840
ggctgccgga acatgtgaga tatactgggc gaactgggga tataatcgta tccggctgga 24900
atcagcggat gagaagaccc gaaaagcctg tgggggcaag cggggtggtt tttatatcca 24960
tgattccacc aaaggttaca gtcatggttg tattgaagtg gaaccggtgt ttttccgtat 25020 
tctgaaacag gagacggaaa aagaaaatgg tgaaaagaca tttacggtta atgttaagta 25080
attatcgtgt ctggacgtgc tgtggttgac aatgattaca atggtggcca gataagtaat 25200
gagcaccaat gataaagtat atgactggcg gtgtgctggt tattaatcaa atttctataa 25260
aatgcaatgc gaatctggta agcgtaataa aaataaaatc ataatattgt tatgttcatt 25320
tccttattta tgtaattcag tatttatgtt atgtgctaat ttttgtgttt ttattttcat 25380
ggctcttgtc agcaatatac cctgttcttc tggtaaatat ttaaatataa caggctggtt 25440 
gcattataaa gtgcggggca ctgtttcctg acggtgagtc tatttatttt aatccggtat 25500 
taaaggagtc actaccatga gttttgtatc cacaaataat aaatccggta tgggagggct 25560 
gacgacaacc acgccgccga taaccggaga aagtggcggt gtcaccgcag attcagtcgc 25620 
cggaagcgtg gcagatgcgg cggaatccgc cgtggaacag gctgcgggat cgctatttgg 25680 
cgcattgccg gagccatcag gactggtgaa agccgcggta gcagcggcgc aggctgccgc 25740
cgccgcaggt atggcgcagg atgcggtatc ggccatcgtc tctgctgttg caggcgggcc 25800 
gggggcgcat aatgtgacgg tcagcggcag cgccgtaccg ccgggcgcat tactgttcgc 25860 
cagcctggac ggcggcgaaa cattaagtga actgttcagc tatgtggtac agctaaaaac 25920 
gcccgacacc ctgaatctgg gctatgtctc cccggcggcc aacctgccgc tcaaaccgat 25980 
ggtgggcaaa gatctgtgcg tcaacatcga actggatggt ggcggtaaac gacatatcag 26040
cgggctggtc acggcggcgc gggtggtggg ccatgaaggg cgttcggtta cctatgagct 26100
gcgtatggag ccgtgggtaa aactgctgac ccataccagc gactacaaag cattccagaa 26160
taaaaccgtg gtggatattc tggatgaggt tctggcggaa tatccctacc cggtggaaaa 26220 




gcggctggtg gaaagctacc cggtacgcac ctggcaggtg cagtacggtg aaactgattt 26280 
tgattttctt cagcgactga tgcaggagtg gggcatctac tggtggtttg agcacagcga 26340 
ggacagccac acgctggtgc tggcggatgc catcagcgcc cacaaagcat gtccggactc 26400
gccgctggtc gagtggcacc aggaagggct gaagctggac aaggagttta tccacactat 26460
cacggcaaac gagagcctgc ggactggaca gtgggtgctg gatgatttcg attttacgaa 26520 
gccacgttca ttgctggcaa acaccgtggc aaacccgcgt gaaaccggtc atgccaccta 26580
cgagcattat gagtggccgg gagactactt cgacaagagt gaaggcgaga tgctgacgcg 26640
cattcgtatg gaagcgcagc gcagccccgg cagtcgggtg ctggggggag ggaatatccg 26700
cacactcatg accggttata ccttcacgct ggaaaactat cccaccgccg aagtcaatca 26760
ggaatatctg ctgatgcaga ccttgctgtt tgtgcaggac aacgcgcagc acagcgggca 26820
ggaccagcac tttacctttt ccacccgttt tgaactgcac cccacccgcg aggtgttccg 26880
cccgcagcgg acggtgagca aaccccacac caaagggccg cagagcgcca tcgtcaccgg 26940 
cccggcgggc caggaaatct ggacggatca gtacgggcgg gtaaaggtac agtttggttg 27000
ggatcgctac ggcaaaatgg atgaaaacag cacctgctgg atacgcgtca gctacccgtg 27060
ggcgggcaaa ggcttcggga tgatccagat cccgcgtatc ggccaggaag tgctggtgga 27120
tttcaaaaac ggcgatccgg atctgccgat catcgtgggg cgtacctaca accaggacac 27180
catgccgccg tggggactgc cgggaatggc gtcgcagagc gggatcttca gccactcgct 27240 
gtatggcggg ccaacgaacg gcaacatgct gcgttttgac gacaaaacgg gcgcggagga 27300
agtgaagttc cacgcggaaa aagatctcaa caccacggtg aagaataatg aaacgcatac 27360
ggttatggtg gatcgcacta aaaccattat taaaaatgaa accaacagta ttggtgagga 27420
cagaaacacc acggtaacga agaatgacgg cctttccgta aaactggcgc agacgatcaa 27480 
tatcggcacc acttatcgtt tagatgttgg cgatcaattc acgcttcgct gcggcaatgc 27540 
ggcgcttgtt ttacataagg acggctccat tgagttttgt ggcaagcaac tgatgttaca 27600 
taccagcgat gtcatgcaac tgattggtaa aggtattgat atgaacccgg atggcggcac 27660
agccgtaacc gccgatgata ttgcccccct tctcacctct gagtgatctg aattaaacct 27720
ggagttctca tggatcgacc ataccgcata caggaagggt gttttgtcct gcctgaaaca 27780
tttacggatc gcagcgtcaa tatttttatc ctggagggca atgaacgaac atcgcccagc 27840
ctgaatattt cccgcgatac gctaaaacct gatgaagacc tgcccgccta tattgaccgc 27900
cagattgcac tgatgaaaaa aaatctcggt cagcaccggg tattgtcgcg agcgcctgca 27960
caggcaggaa cgggcaatga tgcccttatg ggggaacaaa ttgccgccac ccataaatcc 28020
gcggggaacg gtagcatatg ttatcagcgc 30300


gacgcgctgg gcaacccgac ggacatcacg ctgccggacg ggcagcacct gacgcatctg 30360
tattacggga gcgggcatct gttacagacg gcgctggacg gcctgacggt gagcgagtat 30420
gagcgcgaca gcctgcaccg tcagataatg cgcacgcagg ggcagcttgc gacgtacagc 30480
ggctatgacg acgacgggct gctgagctgg cagcgcagtc tggcgtccgg cagtgcccct 30540
gttcttcctg gccagcgccc ggcgcggcag ggctgcgtga cgtcgaggga ctattactgg 30600
aacaaccacg gcgaggtggg cacgattgac gacggcctgc gtggcagcgt ggtgtacagc 30660


tatgacagaa gcggttacct gaccgggcgc tcaggtcaga tgtatgacca tgaccgttat 30720
tattacgata aggcgggcaa cctgctggat aacgaagggc agggagcggt gatgagcaac 30780
cggctgccgg gctgtggtcg tgaccgttac ggctataacg agtggggcga gctgaccacg 30840
cggcgcgacc agcaactgga gtggaacgcg caggggcagc tgacgcgggt catcagcggc 30900
aacacggaga cgcactacgg ctacgatgcg ctggggaggc gaacccgcaa ggcgacgtac 30960
gggcggcaca cgggccatac ggcgcggagc cggacggact tcgtgtggga ggggttcagg 31020


ctgttgcagg agaacgtgca gcagcagggc tggcggacct atctgtacga tgcggaacag 31080
ccgtacacgc cggtggcgag cgtgacgggg cggggagaaa gcaggcaggt gtggtattac 31140
cacacggatg tgacgggcac gccgcaggag gtgacggcgg cggacggaac gctggtgtgg 31200
gcggggtata tcagggggtt tggagagaat gcggcggaca tcagcaacag cggggcgtac 31260
tttcaccagc cgctgcggct gccggggcag tattttgacg acgagacagg gctgcattac 31320
aatctgttca gatattatgc accggagtgt ggacggtttg tcagtcagga tccgatcggg 31380


ctgaggggcg ggttaaacct ttatcagtat gcgccaaatc ctctcaaata tatagaccca 31440
cttggtttaa ccgcgactgt tgggcgatgg atggggcctg cggaatatca gcaaatgctt 31500
gatactggga cagtagtaca aagttcaaca gggacaactc atgttgccta ccctgctgat 31560
atagatgctt ttggtaagca agcaaaaaat ggtgctatgt atgttgaatt tgatgtgcct 31620
gaaaaatcat tagtacctac aaatgaagga tgggcaaaaa tagtagggcc agattctatc 31680
gaagggcgat tagctaaacg caaaggtttg cctgttcctg aaatgccaac agcagaaaac 31740
ataactgtaa ggggcgagaa aattaatggg gaagttgaag caaaatgcta aataaattta 31800
aattgtgggt gagcaaacat actgattata cggtaattca taatgaaaat gatttatctt 31860


a cag c a c l. a t 


tatagattLt 


gaagatgacc 


ggtatatatc 


aagatttact 


gtatgggatg 


31920


dCCCaayCLg 


ca cgccagaa 


gtaatggatg 


tggatactgg 


tttatataaa 


ttaaacaaga 


31980 


gaaacgaacc ccccacattt gatgaacttc tggatatatt tgatgatttt atgataagta 32040
ttaaataata gttggccggg taagaagtta actcttcccg gctgttttat tatctaaccc 32100


ccatcaatcc 


ggagacgcgc 


taccggtacg 






agcaaggcga 


32160 


cy cacyggcy 


gcacacgggc 


catacggcgc 




aaa. c 1 1 1 a t cr 

3 3 aVj ^ ^3 u 3 


tgggaggggt 


32220 


tcaggctgtt 


gcaggagaac 


gtgcagcagc 


a aaa c t aaca 


gacatatctg 


tacgatgcgg 


32280 


aacagccgta 


cacgccggtg 


gcgagcgtga 


ccraQaaacraa 


aaaaaocacra 


caggtgtggt 


32340 


attaccacac 


ggacgtgacg 


ggcacgccgc 


aooaaa taac 


aacaccaaac 


ggaacgctgg 


32400 


tgtgggcggg 




SSgtttggag 


w 3^^Z3 3 


i J t j S 1 rtCi, 

^3 ****** ^— **^j>r 


aacagcgggg 


32460


cgtactttca 


ccagccgctg 


cggctgccgg 


*-j ^4 V O W 


taanaa rnaa 

\* a n ~j o 


acagggctgc 


32520 


attacaatct 


gttcagatat 


tatgcaccgg 


aa t a tcrcra ca 


attcatcaat 

W v w V^" V* W 


caggatccga 


32580 


ttgggctggc 


ggggggggct 


gaatctttac 


caatatorar 




tagatggatc 


32640 


gatcctttag 


gacttgctat 


cctggagcat 


caatetaatt 


ttaatacaoc 

l» a %j ^* z3 z3 


aaggagaacc 


32700 


ggatttgaaa atgcgggtat gacaaaccct aaaaatatca ctttctcaaa agtcgatccc 32760
aaaactggta ctgttgttga gtttaaaggt ccaaatgggg ctaaaattac ttatgatgca 32820
cctcatgcag atatggatgt gacagcaggg catgataaac cacatottao ttggcaatcc 32880
gcaggaaaaa gaggttccgg aggagctaat agaggtaata ttacttatga tggcccacaa 32940
catccgcatc gctctgactc taagggagat gataaatgtt aaattcaaat atgtctgaac 33000
ttagaatcga actggagaat gcgattaaaa atctcggtat tcatgattat cgtgtcgata 33060


aacccgaaca aatcgtttct gagataaaag agatatatgt taatggtaat cctagaacct 33120
ggtggttatc attaaaacat agacaatatg tcttttctta taccgataat tctggatata 33180
aaaacatatc acaaatagta agtaaacaac tcaatgaaag caatgtaatc aacaaacata 33240
tatttttgat tgctgatgaa gataatgagc aaatatatgt atataacgtt cctcttaact 33300
ccctgcctga aattatagaa aattgcagat attttgaata ttatgttgca gatcatgaac 33360
tatcttggct tatatgtgaa aatgatcatg gtgatttgat tgtatgctca accattaagt 33420
aaagcgcgag tgctctttag cgatatagtt gcccatattt aggcgttact agccgaagat 33480


ggcgcgattg tctggcaggg gaaacagcaa ttctgaggtc aggaagatag cataacccat 33540
taaccgggat agatccgcta gacctgaatc cagttgatgc gacaggttat agggtttatg 33600
gttatttgct cctggagcaa ataaacctta ttacattggt attactaatg atatggtttg 33660
acgaagggcc gagcattaaa gcactggcag gttatcaaaa gaaaatggaa ggatgctgcc 33720
atttgatgaa aatgtaatct aatggaaagt cagaggttac gagaaatatt atatagagaa 33780
atataaaacc agaaccggaa ccataggtga aaaaattccc tcaacaaata gagagaataa 33840
atataattca tttgatcatg ggcgaacaga tcccagcgca caagcattta aagactctta 33900


aaatagtaag ggagttggtt ccggtggagg aaaatgcgga tgagtgatta agaattttgg 33960
ggctgtgata agaagtcgag aacaatgctg cgttttgtga agcccggaga catattttgt 34020
tttaaattag atgaagatag atattgtttt gggcgaatta taacactaat gactgtcgga 34080
catctttctg aattatttga tataattaaa aaaccccctg gaataacaga gttagaaatt 34140
agtaatgcaa ggcgaattat tgaaccaatt atagtggata catattcttt atttgataag 34200
aaattagaaa atggaagtga ctggagaatt attggtcatc aggttaatta caatccaaaa 34260
aatttagatg gtatctattt tgcacttgga ataggtgatt cctgtaaaaa gaaagactgt 34320


tacggaaatg attttctcat ttcagaaagt gagtggaaaa cacttcctaa attatctcct 34380
aaagggggtt ttgatatcaa aaaacggctt gaaattgcct gaaaatgaaa ataaaaagcc 34440
gggaaagatc ttttgtcttc ccggatttta ttatttaatc cccgttcacc acattattta 34500
cccccgcctt aatatgcttc atcgactttt tcacctgata aagctccttc cgtagatccc 34560
tcacttcgtc cgtctctgca atcaggatca aacacccctc ggagatcttc acggtgacgc 34620
cgtgcccggt ctcaaatccc gcttcttcct gccagtcacc cttaaggtgc tggctgggga 34680
tttgtgagta acaggcggtc atgcaggttt cgctgttgat atggcggacg ctgacgccgg 34740


aggcattcat atttgctgac taaataaatt cttatttatc cgccggatgc tggctgattg 34800
tggagctcag ggtgagtgag tatgggcgcg acatcctggc accgcgcctc cctctccccg 34860
gccagccccg gccggtgatg agcagccggc tgccggggcc gtattttgac gatgaaacgg 34920
gcctgcatta aaatctgttc agatattatg taccggagtg tggctggttc gtcagtcagg 34980
atccaatagg gctgaaaggg ggatggaacc gatatcattc tccgctgaat cctattacag 35040
atagtgatcc tcttggcctt attacttgtg gtgctgatag aggtgattct ggcaagttat 35100
taagatgaaa aatggttgaa aataattgcg ttagtgaatg ctcatttatg cctttaccta 35160
caaaaagtaa cggatttgct tgctggaatt gtgttaacga atgcaaataa aatcaatgct 35220
gttgatgttt ttactacagg aataagtatg gttaacgata aggatacagc tatattaatt 35280
agtaatttaa tgttgagatt cggtaaggag cttgatgaat ctgttgctgt tgttcagtcc 35340




cgttgtgatg aggatgaatt taatgtatat cgagaaacgg ttggttttat catgggtgaa 35400 
atgcttatta aaataatgaa tccattatat gaaaaacatc cagaaataaa accaaaagga 35460 
ttgaaacaaa acatctggaa ccggatgaat aatgtgtaaa agccggaggg gttatctttt 35520 
cccggctttt tattatcaat tactcattaa ctcctgttcc gttcttttgc gtttaatcac 35580 
cggaatatct ccggtattgt tcagcgcccc ggaaatgttt ttaaccactg ttctgcactc 35640
cgtttattaa tgcgggttac gcccatccct tcaatacagc caaagagtcc gtgggtatgc 35700 
tgcggcgtga tcacgatgca atccctcatt acccgcacct tgacgggcat cccgttaata 35760 
aaccccgcct gcggcatcca ttcaccgtac aaacgaactg aaagtctctc aacgcgtgag 35820
tatgtaagta tcccgcataa tcgagccatt cacatttaga gatcatccga cataatcaat 35880
ctgccaacgc aggagatcgc tatgcgtaaa gcccgtatta ctgcgcacca gatcatcgct 35940
gtgattagat cagtcgaatc cggacggact gttaaagatg tctaccggga gcccggtatt 36000 
tctgaagcca ccagggacaa ctggaggtct ggatacggcg gcagggatac gcgtggaatc 36060
acaaaaggct ccaccgtatt tactgtctgc tcaagctgaa tttfccgccgt aagggtaaac 36120
aacggctgcc ggtacgcaat ccctcgccac tggtcacgcc ggaagcgctg aaccagagct 36180
ggtctgtggg cgtcgttttc gcacgttcaa tgttgttgat gactgtaatc gtgaagcgtt 36240
gtcgattgaa atcgatctga atctgccagc tctgcgagtg gtccgtgtac tcgacaggat 36300 
tacagcaacc gcggttatct ggccatgctg cgtatggata agggaccgga atttatctcg 36360 
ctggcactgg ctgaatgggc aaagaaacat gcagtaaagc tggcgtttat ccagccgggt 36420 
aagccgaaga aaaacgtttt catcacgcgc tttaaccgga cataccgtac agaaatactc 36480 
aattcttatc tgttcagaac gctgaatgag gtgtgggaaa ttacggataa agggttatca 36540 
gaatataact gcgaacgtcc acatgaatcg cggaacaata tgataccgaa ggaataccgc 36600 
caataacgtt atctggccgg aatcttaaaa atgcatggaa ctaaaacggg tctatttaca 36660 
ggggcacctg cgatgaattt cgctgcactg aaaagcgata ccggatgaga gctgcttcaa 36720
attaatgtgc catgttcacg gggaggttgt gcgacgtttg cataatccag caagaactga 36780
aaggaagggg agagcttttt catgcctgta taatcagtct ggcctgtgtc agtcagctct 36840
tagtgttgag actctcgttg gagcgttata attgcttttc tgtttcggaa aacaagattt 36900
tccattaaag atcttccctg cgaggaaaag ttaactaata atcttaccgt cgagttagga 36960 
gatgtatgtt taaatataaa caatgttgca acgatgcctg ataattatcc tctcttcgaa 37020 
gataagtttc ccacacccag tgtagtaggt gtcatggtaa tgttatcact tgaatgtaaa 37080
tggaaggtat aattgctttt tgactggcat tctattccac cctgacaaca cgatgttaac 37140 
atcaacactg tttatattgg caataacgca atttttttca gattaagagg tgctctgata 37200 



tatagatttt tatgacatta cttatttgaa ttggtaacaa ataccaataa gtacaagctg 37260 

ttattaccag ccacggattt tttacatacg gtaagatttg gtatggcgtt atgtattctg 37320 

gatgtgctgg attattttaa tttggtttaa aaaaggtggt tattcaa atg aaa agc 37376

Met Lys Ser 
1 

ata aaa aaa ttg att atc gca agt gcg ttg agc atg atg gct gct agt 37424
Ile Lys Lys Leu Ile Ile Ala Ser Ala Leu Ser Met Met Ala Ala Ser
5 10 15

tgt tat gct ggc tca ttt ttg ccg aac tca gag caa caa aaa tca gtg 37472
Cys Tyr Ala Gly Ser Phe Leu Pro Asn Ser Glu Gln Gln Lys Ser Val
20 25 30 35

gat att gtg ttt tcc tct ccc caa gat tta acc gta tcg ctt att cca 37520
Asp Ile Val Phe Ser Ser Pro Gln Asp Leu Thr Val Ser Leu Ile Pro
40 45 50

gtg tcg ggc tta aag gct ggg aaa aat gct cct agc gcg aaa att gcg 37568
Val Ser Gly Leu Lys Ala Gly Lys Asn Ala Pro Ser Ala Lys Ile Ala
55 60 65

aag ctt gta gtt aat tct act act ctt aaa gaa ttc ggg gtc agg ggg 37616 
Lys Leu Val Val Asn Ser Thr Thr Leu Lys Glu Phe Gly Val Arg Gly 
70 75 80 

att tct aac aac gtg gta gac agt act ggc act gca tgg cgt gta gct 37664
Ile Ser Asn Asn Val Val Asp Ser Thr Gly Thr Ala Trp Arg Val Ala
85 90 95

ggt aaa aat act ggt aaa gag atc ggt gtg ggc tta tca agt gac agt 37712
Gly Lys Asn Thr Gly Lys Glu Ile Gly Val Gly Leu Ser Ser Asp Ser
100 105 110 115

ctt aga aga tct gat agc acg gaa aaa tgg aat ggg gtg aac tgg atg 37760
Leu Arg Arg Ser Asp Ser Thr Glu Lys Trp Asn Gly Val Asn Trp Met
120 125 130

acc ttt aat agc aat gac aca ctt gat att gtc ctg aca gga ccg gcg 37808
Thr Phe Asn Ser Asn Asp Thr Leu Asp Ile Val Leu Thr Gly Pro Ala
135 140 145

cag aat gtc aca gct gac acg tac cca ata act tta gac gta gtg gga 37856
Gln Asn Val Thr Ala Asp Thr Tyr Pro Ile Thr Leu Asp Val Val Gly
150 155 160

tat caa cct taa tagtaaacaa ctattagtgt attgtgcctt gtttaaggcg 37908
Tyr Gln Pro
165

caatacacat caaatcatct atttttcttt tacaattttt gat atg aaa ata gtt 37963
Met Lys Ile Val
170

aat ttt gct gta atg gcg gta gct ttg ttc gcc act aat tct atg gtt 38011
Asn Phe Ala Val Met Ala Val Ala Leu Phe Ala Thr Asn Ser Met Val
175 180 185



tca gta tat gcc gtc aac cag caa tta aat tca gcc act aaa tta ttc 38059
Ser Val Tyr Ala Val Asn Gln Gln Leu Asn Ser Ala Thr Lys Leu Phe
190 195 200

agc gtg aag ctg ggg gct aca cga gtg att tat cac gct ggt acg gct 38107
Ser Val Lys Leu Gly Ala Thr Arg Val Ile Tyr His Ala Gly Thr Ala
205 210 215

gga gcg acg ctc tcg gtg agc aac ccg cag aat tac cct att ttg gtt 38155
Gly Ala Thr Leu Ser Val Ser Asn Pro Gln Asn Tyr Pro Ile Leu Val
220 225 230 235

cag tct tca gtc aaa gca gca gac aaa agt tcg cct gct ccc ttt ttg 38203
Gln Ser Ser Val Lys Ala Ala Asp Lys Ser Ser Pro Ala Pro Phe Leu
240 245 250

gtg atg ccg cct cta ttt cgt tta gaa gca aac cag cag agt caa ctg 38251
Val Met Pro Pro Leu Phe Arg Leu Glu Ala Asn Gln Gln Ser Gln Leu
255 260 265

cgt att gtc cgt act ggt ggt gac atg cca acg gat cgt gag act tta 38299
Arg Ile Val Arg Thr Gly Gly Asp Met Pro Thr Asp Arg Glu Thr Leu
270 275 280

cag tgg gtc tgt ata aag gcg gta cca ccc gaa aat gaa ccg tcg gat 38347
Gln Trp Val Cys Ile Lys Ala Val Pro Pro Glu Asn Glu Pro Ser Asp
285 290 295

aca cag gct aag ggc gcg acc ctt gac ctc aat ttg tcc atc aac gcc 38395
Thr Gln Ala Lys Gly Ala Thr Leu Asp Leu Asn Leu Ser Ile Asn Ala
300 305 310 315

tgt gat aag ctg att ttc cgc ccg gat gcc gtg aag ggg acg ccg gaa 38443
Cys Asp Lys Leu Ile Phe Arg Pro Asp Ala Val Lys Gly Thr Pro Glu
320 325 330

gat gtt gca gga aat tta aga tgg gtg gag acg ggc aac aaa ctt aag 38491
Asp Val Ala Gly Asn Leu Arg Trp Val Glu Thr Gly Asn Lys Leu Lys
335 340 345

gtg gag aac ccc acc ccg ttt tac atg aat tta gcc tct gtc aca gta 38539
Val Glu Asn Pro Thr Pro Phe Tyr Met Asn Leu Ala Ser Val Thr Val
350 355 360

ggg gga aag ccc att aca ggg ctt gag tat gtc ccc ccc ttt gct gac 38587
Gly Gly Lys Pro Ile Thr Gly Leu Glu Tyr Val Pro Pro Phe Ala Asp
365 370 375

aaa aca cta aat atg cca ggt agt gcc cat ggt gat atc gag tgg aga 38635
Lys Thr Leu Asn Met Pro Gly Ser Ala His Gly Asp Ile Glu Trp Arg
380 385 390 395

gtt att aca gac ttt ggt ggt gaa agt cat ccg ttc cac tac gtt ctt 38683
Val Ile Thr Asp Phe Gly Gly Glu Ser His Pro Phe His Tyr Val Leu
400 405 410

aaa taa atccaggggc tcagcggcag aaa atg aag ttc aaa caa cct gcc ttg 38736
Lys Met Lys Phe Lys Gln Pro Ala Leu
415 420



cta ctg ttc atc gcg gga gtg gtt cat tgc gca aat gcg cac act tac 38784
Leu Leu Phe Ile Ala Gly Val Val His Cys Ala Asn Ala His Thr Tyr
425 430 435



aca ttc gat gca tca atg ttg ggc gat gca gcg aaa ggg gtt gat atg 38832
Thr Phe Asp Ala Ser Met Leu Gly Asp Ala Ala Lys Gly Val Asp Met
440 445 450

tcg ctc ttt aac cag ggg tta caa cag cca ggg act tat cgc gtg gac 38880
Ser Leu Phe Asn Gln Gly Leu Gln Gln Pro Gly Thr Tyr Arg Val Asp
455 460 465

gtg atg gtg aat ggt aaa cgt gtc gac acc cgt gat gtg gtg ttc aaa 38928
Val Met Val Asn Gly Lys Arg Val Asp Thr Arg Asp Val Val Phe Lys
470 475 480 485

ttg gaa aag gat ggg caa gga acg cct gtt ctg gct cct tgt ttg acg 38976
Leu Glu Lys Asp Gly Gln Gly Thr Pro Val Leu Ala Pro Cys Leu Thr
490 495 500

gtc agt cag ctt tca cgc tac ggc gta aaa acg gaa gat tac cct cag 39024
Val Ser Gln Leu Ser Arg Tyr Gly Val Lys Thr Glu Asp Tyr Pro Gln
505 510 515

ttg tgg aaa gca gca aag ccc cca gat gag tgt gcg gat ctg acc gcc 39072
Leu Trp Lys Ala Ala Lys Pro Pro Asp Glu Cys Ala Asp Leu Thr Ala
520 525 530

att cca cag gct aaa gcg gta ctg gat atc aat aat cag caa ctg caa 39120
Ile Pro Gln Ala Lys Ala Val Leu Asp Ile Asn Asn Gln Gln Leu Gln
535 540 545

ctg agt att ccg cag ttg gcg ttg cgt ccg gaa ttt aag ggg atc gct 39168
Leu Ser Ile Pro Gln Leu Ala Leu Arg Pro Glu Phe Lys Gly Ile Ala
550 555 560 565

cca gaa gat ctt tgg gat gat ggt att ccg gcg ttt ctg atg aac tac 39216
Pro Glu Asp Leu Trp Asp Asp Gly Ile Pro Ala Phe Leu Met Asn Tyr
570 575 580

agt gcg agg aca acg cag acg gat tac aaa atg gat atg gtg ggg cgt 39264
Ser Ala Arg Thr Thr Gln Thr Asp Tyr Lys Met Asp Met Val Gly Arg
585 590 595

gac aac tct tcc tgg gta caa ctg caa ccg gga atc aat ata ggt gcg 39312
Asp Asn Ser Ser Trp Val Gln Leu Gln Pro Gly Ile Asn Ile Gly Ala
600 605 610



tgg cgt gtc cgc aat gcg acc agc tgg cag cgg agt agt caa ctg tcg 39360
Trp Arg Val Arg Asn Ala Thr Ser Trp Gln Arg Ser Ser Gln Leu Ser
615 620 625

ggg aag tgg cag gca gca tat acc tat gct gag cgt gga ctg tac tca 39408
Gly Lys Trp Gln Ala Ala Tyr Thr Tyr Ala Glu Arg Gly Leu Tyr Ser
630 635 640 645

cta aaa agt cgt ctg act ctg ggg caa aag act tcg cag ggg gag ata 39456
Leu Lys Ser Arg Leu Thr Leu Gly Gln Lys Thr Ser Gln Gly Glu Ile
650 655 660

ttt gat agt gtg cca ttt acc ggt gtg atg ttg gca tcg gat gac aac 39504
Phe Asp Ser Val Pro Phe Thr Gly Val Met Leu Ala Ser Asp Asp Asn
665 670 675

atg gtg ccc tac agt gag cgg cag ttt gct ccg gta gtg cgt ggg att 39552
Met Val Pro Tyr Ser Glu Arg Gln Phe Ala Pro Val Val Arg Gly Ile
680 685 690

gcc cgc acg cag gct cgg gtg gag gtc aaa cag aat ggt tac acc att 39600
Ala Arg Thr Gln Ala Arg Val Glu Val Lys Gln Asn Gly Tyr Thr Ile
695 700 705

tac aac acc act gtg gcg ccc gga ccg ttt gca ctg cgg gat ctg tcg 39648
Tyr Asn Thr Thr Val Ala Pro Gly Pro Phe Ala Leu Arg Asp Leu Ser
710 715 720 725

gta aca gac agt agt ggt gat ctg cat gtc acc gtg tgg gag gcc gat 39696
Val Thr Asp Ser Ser Gly Asp Leu His Val Thr Val Trp Glu Ala Asp
730 735 740

ggc agt aca caa atg ttt gtg gtg ccg tat cag acc ccg gcg ata gca 39744
Gly Ser Thr Gln Met Phe Val Val Pro Tyr Gln Thr Pro Ala Ile Ala
745 750 755

ctg cac cag gga tat ttg aag tac agc ctg ttg gcg ggc cga tac cga 39792
Leu His Gln Gly Tyr Leu Lys Tyr Ser Leu Leu Ala Gly Arg Tyr Arg
760 765 770

tcg tca gac tct gca acg gat aag cgg cag atc gcg cag gct acg ttg 39840
Ser Ser Asp Ser Ala Thr Asp Lys Arg Gln Ile Ala Gln Ala Thr Leu
775 780 785

atg tat ggt ctg ccg tgg aat ctc act gca tac ggc ggt ata cag agt 39888
Met Tyr Gly Leu Pro Trp Asn Leu Thr Ala Tyr Gly Gly Ile Gln Ser
790 795 800 805

gca acg cat aat caa gct gca ttg ctt ggt ttg ggg gga tct ctc ggg 39936
Ala Thr His Asn Gln Ala Ala Leu Leu Gly Leu Gly Gly Ser Leu Gly
810 815 820

cgg tgg ggg agt tta tct gtc gat gga agc gac aca cac agt cag cgt 39984
Arg Trp Gly Ser Leu Ser Val Asp Gly Ser Asp Thr His Ser Gln Arg
825 830 835

cag ggg gag gcg gta cag caa gga gcc tcc tgg cga ctg cgt tac agc 40032
Gln Gly Glu Ala Val Gln Gln Gly Ala Ser Trp Arg Leu Arg Tyr Ser
840 845 850

aac cag ctg act gcg acg ggg aca aat ttt ttt ctg acg aga tgg cag 40080
Asn Gln Leu Thr Ala Thr Gly Thr Asn Phe Phe Leu Thr Arg Trp Gln
855 860 865



tat gcc tcg cag ggc tat aac acc cta tcc gat gtg ctc gac agt tat 40128
Tyr Ala Ser Gln Gly Tyr Asn Thr Leu Ser Asp Val Leu Asp Ser Tyr
870 875 880 885

cga cat aat ggc aac cgt cta tgg tcg tgg cgg gaa aat ttg cag ccg 40176
Arg His Asn Gly Asn Arg Leu Trp Ser Trp Arg Glu Asn Leu Gln Pro
890 895 900

agc tcg cgt act acc ctg atg ttg agt cag tca tgg ggg agg cat ttg 40224
Ser Ser Arg Thr Thr Leu Met Leu Ser Gln Ser Trp Gly Arg His Leu
905 910 915

ggc aat ctg agt tta acc ggt tcc cgt acc gac tgg cgt aat cgc ccc 40272
Gly Asn Leu Ser Leu Thr Gly Ser Arg Thr Asp Trp Arg Asn Arg Pro
920 925 930

ggt cat gat gac agc tac gga ctg agt tgg gga acc tct atc gga ggg 40320
Gly His Asp Asp Ser Tyr Gly Leu Ser Trp Gly Thr Ser Ile Gly Gly
935 940 945

ggc tcg ctg tca ttg aac tgg aat caa aac aga acg ctg tgg cgc aat 40368
Gly Ser Leu Ser Leu Asn Trp Asn Gln Asn Arg Thr Leu Trp Arg Asn
950 955 960 965

ggc gcg cac cgt aaa gag aac ata acc agc ctg tgg ttc agt atg cca 40416
Gly Ala His Arg Lys Glu Asn Ile Thr Ser Leu Trp Phe Ser Met Pro
970 975 980

tta agc cgc tgg acg ggg aat aat gta agt gct agt tgg cag atg act 40464
Leu Ser Arg Trp Thr Gly Asn Asn Val Ser Ala Ser Trp Gln Met Thr
985 990 995

tca cca tca cac ggt ggt cag acg caa caa gtg ggg gtc aac gga gag 40512
Ser Pro Ser His Gly Gly Gln Thr Gln Gln Val Gly Val Asn Gly Glu
1000 1005 1010

gca ttc agt cag caa ctg gat tgg gag gtg cgt cag agt tac cgt gcc 40560
Ala Phe Ser Gln Gln Leu Asp Trp Glu Val Arg Gln Ser Tyr Arg Ala
1015 1020 1025

gat gcc ccg cca ggt ggt ggt aat aac agc gca ttg cac ttg gca tgg 40608
Asp Ala Pro Pro Gly Gly Gly Asn Asn Ser Ala Leu His Leu Ala Trp
1030 1035 1040 1045

aat ggg gat tac ggc ctg tta ggt ggt gac tat agc tac agc cgg gcg 40656
Asn Gly Asp Tyr Gly Leu Leu Gly Gly Asp Tyr Ser Tyr Ser Arg Ala
1050 1055 1060

atg cgc cag atg gga gtc aat atc gcg gga ggt ata gtt atc cac cat 40704
Met Arg Gln Met Gly Val Asn Ile Ala Gly Gly Ile Val Ile His His
1065 1070 1075

cat ggt gtg acg ctg ggg caa cct ttg caa ggc tca gtg gcg ctg gtt 40752
His Gly Val Thr Leu Gly Gln Pro Leu Gln Gly Ser Val Ala Leu Val
1080 1085 1090

gaa gcg cca ggg gcc tcg ggg gtg cca gtt ggc ggc tgg cct ggc gtt 40800
Glu Ala Pro Gly Ala Ser Gly Val Pro Val Gly Gly Trp Pro Gly Val
1095 1100 1105

aag acg gat ttt cgt ggc gac acc aca gtg ggc aac ctg aac gtc tat 40848
Lys Thr Asp Phe Arg Gly Asp Thr Thr Val Gly Asn Leu Asn Val Tyr
1110 1115 1120 1125

cag gag aat aca gtc agc ctc gat ccg tcg cga cta ccg gat gac gca 40896
Gln Glu Asn Thr Val Ser Leu Asp Pro Ser Arg Leu Pro Asp Asp Ala
1130 1135 1140

gag gtc aca caa acc gat gtg cgc gtg gtg cca acc gaa ggg gcg gtg 40944
Glu Val Thr Gln Thr Asp Val Arg Val Val Pro Thr Glu Gly Ala Val
1145 1150 1155

gtg gaa gcg aag ttt cac act cgc atc ggg gcc agg gca ctg atg acg 40992
Val Glu Ala Lys Phe His Thr Arg Ile Gly Ala Arg Ala Leu Met Thr
1160 1165 1170

ctg aaa cgg gaa gat ggt agc gcc att cct ttc ggg gcg cag gtt aca 41040
Leu Lys Arg Glu Asp Gly Ser Ala Ile Pro Phe Gly Ala Gln Val Thr
1175 1180 1185

gtc aat ggg cag gat ggc agt gct gct ctg gtg gat act gat agc cag 41088
Val Asn Gly Gln Asp Gly Ser Ala Ala Leu Val Asp Thr Asp Ser Gln
1190 1195 1200 1205

gtt tat ctc act ggt ttg gcg gat aag ggc gaa ctg acg gtg aaa tgg 41136
Val Tyr Leu Thr Gly Leu Ala Asp Lys Gly Glu Leu Thr Val Lys Trp
1210 1215 1220

gga gca cag caa tgt cgg gtt aac tac cgc cta cct gcc cac aag gga 41184
Gly Ala Gln Gln Cys Arg Val Asn Tyr Arg Leu Pro Ala His Lys Gly
1225 1230 1235

atc gcg ggc ttg tat caa atg agc ggt ctc tgc aga tag ccgattctga 41233
Ile Ala Gly Leu Tyr Gln Met Ser Gly Leu Cys Arg
1240 1245 1250

aggagagaat a atg tgg atg aaa ata cag cga gtg aaa acg gtt atc tat 41283
Met Trp Met Lys Ile Gln Arg Val Lys Thr Val Ile Tyr
1255 1260

agc gta agc tta ctg gtc gct gcc agt agc ttg gtg ccg ata gcg aac 41331
Ser Val Ser Leu Leu Val Ala Ala Ser Ser Leu Val Pro Ile Ala Asn
1265 1270 1275

gcc gca gaa aaa ctt cag aca acg cta cgt gta ggt act tac ttt cgg 41379
Ala Ala Glu Lys Leu Gln Thr Thr Leu Arg Val Gly Thr Tyr Phe Arg
1280 1285 1290 1295

gct ggg cac gtg cca gat ggg atg gtg ctt gcg caa ggc tgg gtg act 41427
Ala Gly His Val Pro Asp Gly Met Val Leu Ala Gln Gly Trp Val Thr
1300 1305 1310

tat cac ggc agt cac agc ggg ttt cgg gta tgg agc gat gag caa aag 41475
Tyr His Gly Ser His Ser Gly Phe Arg Val Trp Ser Asp Glu Gln Lys
1315 1320 1325

gcg ggt aac acg cct acc gta ttg ctg ctg agc ggg caa cag gat cct 41523
Ala Gly Asn Thr Pro Thr Val Leu Leu Leu Ser Gly Gln Gln Asp Pro
1330 1335 1340

cgc cat cac att cag gtt cgc ctg gag ggc gag ggg tgg caa cca gat 41571
Arg His His Ile Gln Val Arg Leu Glu Gly Glu Gly Trp Gln Pro Asp
1345 1350 1355

acg gtg agt ggt cgt ggc gcc att tta aga acc gct gca gat aac gcc 41619
Thr Val Ser Gly Arg Gly Ala Ile Leu Arg Thr Ala Ala Asp Asn Ala
1360 1365 1370 1375

agt ttc agt gtg gtc gtt gat ggc aat cag gaa gtg cct gcg gac acc 41667
Ser Phe Ser Val Val Val Asp Gly Asn Gln Glu Val Pro Ala Asp Thr
1380 1385 1390

tgg acg ctg gat ttt aag gcc tgt gca ttg gcg cag gag gat acg tag 41715
Trp Thr Leu Asp Phe Lys Ala Cys Ala Leu Ala Gln Glu Asp Thr
1395 1400 1405



ccgtctgttc cactcatact tcctgtatca gtgaataagg cgaatactgt atatcaggtt 41775
cgaggatgta ttgaaaattt actcaaggca aatatggtta gcaatcttca ttgcagaaaa 41835
atcaattgct gattgatttt aaaatctgaa attcatcttt tttgtaggga gaggaatatt 41895
atgtttcgta aaataattta acggacactg atgcctttat tatttctgtt ttgtggattt 41955
gtttgattca ctcttttgtg agcggggctt tataagcccc tggtataagg tttgtttatc 42015
tgtcaggtat tcatatgtga tatttaagat tttatgttta gggggctttt tcttgtcccc 42075
tcaggcgtaa aaataattta ttttttacat aaaggaataa agcatatgtc ttatgcacga 42135


catttaccgg tattaatgta tcaccatgtc agtgataaac ccggacagat aaccttatct 42195
ccccgtacgt tccgggcgca gatgaaatgg ctggccgaat ctggctggaa aaccgttact 42255
gcctgcagag gtggaagctt tttatcatgg tgcaagattg cctcgtaaaa gcgtcatgct 42315
gacctttgat ggcggctggc tggataactg gttgcaggtt tttccggtgc tgcaggagtt 42375
taatctgcat gcgcatctct ttcttgtgac cagtttgatc agtgacggac cggtccgtat 42435
tcctgcaggc gaaccggtgt actctcatga tgagtgtcaa atgctggtta aacaaggccg 42495
ggctgatgag gtcatgctgc gctggtcaga ggtccgggag atgcacctca gtggccttgt 42555


tgagtttcac tcgcacacgc acacccaccg acgctgggac cagaagcctg tgtcccgtaa 42615
tccgtcggat ttgcttcgtg tcgatattct tcttagtcgt aagcggatga gggagatgct 42675
gggttattgc agtcagcatc tgtgctggcc tgagggctgg tattgttctg actatattca 42735
tgtggctgaa gagttggggt tcacatacct gtataccaca gaaaggcgta tgaacaatcc 42795
agtcatcggt tcacagcgta ttggtcgtat caacgcaaag gagcgaaaga atgtgggctg 42855
gctgaaacgt cgtctgtttt atcacaccac gcccggattt tcttcgctgc tggcccggca 42915
taagggggca cgtcggatag ctgactgagc cagagaccag gatgaaagtg ctgcataccg 42975


aatcctcccc catcatcggc gggcaggggt tgtaggctat atcccaaatg atggtgctga 43035
ttttgagcag aaataagata acaccggagg ccgtgataga cccggttgcg cagaatctgg 43095
aggcagcagg tcactggcgc agggcttcta cccgctggct tttggttatg ggagattttg 43155
aatgtacaga ggcccagcgg gagtggttgt tgttgcgccg gaattattgc cttgcgcaga 43215
tatcttcccc tggcgccagt aaagctggat atcagcgacg tgacgaaatg agaggagcct 43275
gcacataggt ttgtgttttg agtaagtgtc ttatttaaac cgtctgttct gtttcctccg 43335
ctttcacaaa taatgtcgag ccgggtgggg gactcaagta agaataatct ggcgatgttt 43395
tgcttgtttc cacgggatac tttgttaggt gaacgataca attaatgcgc tccagactgc 43455




gtcacagaaa aggagaatgc ctttgcagaa cgggcgcatg acaatgaacc actttattat 43515
tggtatctgg ttcggcgaca accttgacga tcacgtctga gatagggtat ttacactgag 43575
tggtaaacag gttcaattag taaccggaga tggatgcaaa atcatgatcg attcagatgc 43635
tattttgcag ccaatagatt ttttattaag atgatatcaa ggacattgag gcacacaatg 43695
acgtatcagt aagtcgttga tagctcattt gatataagaa tttcttttat caacggaaga 43755
taatgatgga actgatcaat aatcgtggta tgcgagactg gatgatattt attaaagtgg 43815
cggaagtagg gaatctttcc cgggctgcgc gggaattaga tattagcatt tctgctgtca 43875
gtaaatcgct tagtcgcctt gagaattcta ttgaggttac tttacttcgg cgcgattcac 43935


atcacttaga actgactgga gctggtcaga cagcctatgc aagcatgaaa aggataacat 43995
cttcctttca gtccttgctg gatgaattgc gaaatccgga taaaattatc agagggagta 44055
taaaattttc ggctccggct attgtctgtg agtttcttgc caataagtgg atatgggaat 44115
ttacagctag ctatccggat acaaaaatct acctggattc acgagagcgt agcgattttt 44175
ttagcaaacc cctggagttt gatgagctgg ttttcaaaag cggcataatc gaasgtgagg 44235
atctcgtgta tcgaaagata agccctttaa agttggttct ttgtgcgagt ccgaaatata 44295
tcagaaaata tggcaggatc tcacaccctg gcgatttgga aaatcacatt attgtgggtc 44355
ttcacaacca tggtctttcc ggacctctta ctcttttccg tcaggatgaa tcatacacta 44415


ttagtggcgc tgttaatgtt catttatctt ccaataatct tttgagtgtt cttaatttgg 44475
ttttagaagg aaagggtatc aacctcatga ctccggcctg gcttgccacc aaatacttaa 44535
aaaataatga acttgaaatt atacttcctg aatggagggt tccagatctc cccatttatc 44595
ttgtatggcg tcatcgtcag tattattctc ctttatttca acgctttctg tcttttattg 44655
aagataaatg gaataatcgc ccacaaattg attttctgaa tgatgattaa cccgtttgga 44715
atggttttga tacgttcctg acttaaaccc acatgatgac tgaattgagg catcgagata 44775
tgcgactggt cagccagtcg tcttttgacg atgcccaata caaaacatga tgccgactga 44835
cggaaatgat aatacgcgga aacaggacgg ggctgttttt gggcagccgg aagttaagcc 44895


cataccagaa acgttgcagt gtactgaaaa atggcgccag gttgcacctg ttcaaagatt 44955
ttctgaaggc gcaggagtat tcattactga tatcctccat tgcgccttcg ggaacccaca 45015
ggaccagcta ttttaccgat agtgtttaaa aggcgtaagt aatgccgagc atgaagtcat 45075
tggaggcagc ctttgtgtct gcatcataag cggtatgttc atcaccagca tagtgatttt 45135
ttgaaatgct tactttgcca gcattaatgt atttataact ggcgtcaatc ataatattat 45195
ctgttacagc atattttgca ccgatacctg cgccccaggc aaagttattt tttgaagcag 45255
acagagtttc attaatacca aaaccaacag gaatggtgtt attacttagc ttcacatgag 45315




cgaggccaac gcctgcgctg atatagggag taaatgccgt actattgtga aaatcataat 45375
agccattaac catgtaagtg gtcattcgga cctgatcttt tacatttatg tgtactggat 45435
caccaaatgc aataatatcc tgcccgcctt tagcatccgt ctcacctctg aaagtggtat 45495
ccagttctaa acgtactgga agctggaatg gatcataaaa gtcataaccg atagcaaccc 45555
cgccgccaaa aacgcctttg gtacggtcag gtaacgttgc atgaccatta actatctcat 45615
cctggctgaa ggttgagttg attccataga cattgactac ggatgtcccc gctttcccgg 45675
tgatatagat cccttctttt gctgatgcag tagcggacca ggctaccaca aggggaatga 45735
tgcagactgc gaaaaagttt ttcatttcag aacctgcctt aatattgggc taaaagacaa 45795
gtttcacggt atagggtgtg atataacgat tacataaacg aagcccaaaa aacggtctat 45855


tgtaacgctg ggttttctgt aagcgggtaa aaaatgagat gaagatttta aataacaata 45915
cgataatcgt cggtatggaa atccatctcc tcgccaaatt gccccacgta cggtttcact 45975
tctacgttat gtaacgggta gtgtgagatg gagcgatgct gtaagaaaaa gatgaagatg 46035
aatttgtacc cgacctggat aaagcccgtt atcccggaat aacgggcaaa aatatttact 46095
caagtgcctg ggcgagatct tgttgtacct gttgacgctg ttctggtgtt aagactttgc 46155
ttaaatcaaa ataatattta acccgataat agcgagcctg ttgttctatg ttactgaagg 46215
ctgcaagctg ctgttttacg gcggcgtcat cccatttacc ggatttaatc acctctatca 46275
gcgcaccgtc tttaattccc ttcatagaaa tctgactgac gtcggtttcc agttgttggt 46335
gaagtttttt gatccgggta atctgatcgt ttgtcagctt cagatgctgg acaataggat 46395


cctgggcggg caggggagga ttggggacag cggtggcgaa agcgccaaaa gaaacgcccg 46455
ccagagtcgc tgccagtaaa gttgtgcgta caaagttttt catgaagata tcctgataag 46515
ggagtgatta accgttttta ttacccacga atggcgagca attatcttag agcctatccc 46575
agtagggcta ttttacttgc cattttggac ctgggcagtg ctcgccaaaa cgcgttagcg 46635
ttttgaacgc cgctagcggc ggcccgaagg gcgagcgtag cgagtcaaac ctcacgtact 46695
acgtgtacgc tccggttttt gcgcgctgtc cgtgtccaaa ctggctgcgc caataacgcc 46755
tggtgggata ggctcttagt cagaatacgt tgcctgccac attacgccac gcgaatttgt 46815
tttacggaga gttacggagt gaaacaatcc cgccgcggtg agcggcaggt tgctt 46870



<210> 2 

<211> 166 

<212> PRT 

<213> Salmonella typhimurium 



<400> 2 

Met Lys Ser Ile Lys Lys Leu Ile Ile Ala Ser Ala Leu Ser Met Met
1 5 10 15

Ala Ala Ser Cys Tyr Ala Gly Ser Phe Leu Pro Asn Ser Glu Gln Gln
20 25 30

Lys Ser Val Asp Ile Val Phe Ser Ser Pro Gln Asp Leu Thr Val Ser
35 40 45

Leu Ile Pro Val Ser Gly Leu Lys Ala Gly Lys Asn Ala Pro Ser Ala
50 55 60

Lys Ile Ala Lys Leu Val Val Asn Ser Thr Thr Leu Lys Glu Phe Gly
65 70 75 80

Val Arg Gly Ile Ser Asn Asn Val Val Asp Ser Thr Gly Thr Ala Trp
85 90 95

Arg Val Ala Gly Lys Asn Thr Gly Lys Glu Ile Gly Val Gly Leu Ser
100 105 110

Ser Asp Ser Leu Arg Arg Ser Asp Ser Thr Glu Lys Trp Asn Gly Val
115 120 125

Asn Trp Met Thr Phe Asn Ser Asn Asp Thr Leu Asp Ile Val Leu Thr
130 135 140

Gly Pro Ala Gln Asn Val Thr Ala Asp Thr Tyr Pro Ile Thr Leu Asp
145 150 155 160

Val Val Gly Tyr Gln Pro
165



<210> 3 
<211> 245 
<212> PRT 

<213> Salmonella typhimurium 
<400> 3 

Met Lys Ile Val Asn Phe Ala Val Met Ala Val Ala Leu Phe Ala Thr
1 5 10 15

Asn Ser Met Val Ser Val Tyr Ala Val Asn Gln Gln Leu Asn Ser Ala
20 25 30

Thr Lys Leu Phe Ser Val Lys Leu Gly Ala Thr Arg Val Ile Tyr His
35 40 45

Ala Gly Thr Ala Gly Ala Thr Leu Ser Val Ser Asn Pro Gln Asn Tyr
50 55 60

Pro Ile Leu Val Gln Ser Ser Val Lys Ala Ala Asp Lys Ser Ser Pro
65 70 75 80

Ala Pro Phe Leu Val Met Pro Pro Leu Phe Arg Leu Glu Ala Asn Gln
85 90 95

Gln Ser Gln Leu Arg Ile Val Arg Thr Gly Gly Asp Met Pro Thr Asp
100 105 110

Arg Glu Thr Leu Gln Trp Val Cys Ile Lys Ala Val Pro Pro Glu Asn
115 120 125

Glu Pro Ser Asp Thr Gln Ala Lys Gly Ala Thr Leu Asp Leu Asn Leu
130 135 140

Ser Ile Asn Ala Cys Asp Lys Leu Ile Phe Arg Pro Asp Ala Val Lys
145 150 155 160

Gly Thr Pro Glu Asp Val Ala Gly Asn Leu Arg Trp Val Glu Thr Gly
165 170 175

Asn Lys Leu Lys Val Glu Asn Pro Thr Pro Phe Tyr Met Asn Leu Ala
180 185 190

Ser Val Thr Val Gly Gly Lys Pro Ile Thr Gly Leu Glu Tyr Val Pro
195 200 205

Pro Phe Ala Asp Lys Thr Leu Asn Met Pro Gly Ser Ala His Gly Asp
210 215 220

Ile Glu Trp Arg Val Ile Thr Asp Phe Gly Gly Glu Ser His Pro Phe
225 230 235 240

His Tyr Val Leu Lys
245



<210> 4 
<211> 836 
<212> PRT 

<213> Salmonella typhimurium
<400> 4 

Met Lys Phe Lys Gln Pro Ala Leu Leu Leu Phe Ile Ala Gly Val Val
1 5 10 15

His Cys Ala Asn Ala His Thr Tyr Thr Phe Asp Ala Ser Met Leu Gly
20 25 30

Asp Ala Ala Lys Gly Val Asp Met Ser Leu Phe Asn Gln Gly Leu Gln
35 40 45

Gln Pro Gly Thr Tyr Arg Val Asp Val Met Val Asn Gly Lys Arg Val
50 55 60

Asp Thr Arg Asp Val Val Phe Lys Leu Glu Lys Asp Gly Gln Gly Thr
65 70 75 80

Pro Val Leu Ala Pro Cys Leu Thr Val Ser Gln Leu Ser Arg Tyr Gly
85 90 95

Val Lys Thr Glu Asp Tyr Pro Gln Leu Trp Lys Ala Ala Lys Pro Pro
100 105 110

Asp Glu Cys Ala Asp Leu Thr Ala Ile Pro Gln Ala Lys Ala Val Leu
115 120 125

Asp Ile Asn Asn Gln Gln Leu Gln Leu Ser Ile Pro Gln Leu Ala Leu
130 135 140

Arg Pro Glu Phe Lys Gly Ile Ala Pro Glu Asp Leu Trp Asp Asp Gly
145 150 155 160

Ile Pro Ala Phe Leu Met Asn Tyr Ser Ala Arg Thr Thr Gln Thr Asp
165 170 175

Tyr Lys Met Asp Met Val Gly Arg Asp Asn Ser Ser Trp Val Gln Leu
180 185 190

Gln Pro Gly Ile Asn Ile Gly Ala Trp Arg Val Arg Asn Ala Thr Ser
195 200 205

Trp Gln Arg Ser Ser Gln Leu Ser Gly Lys Trp Gln Ala Ala Tyr Thr
210 215 220

Tyr Ala Glu Arg Gly Leu Tyr Ser Leu Lys Ser Arg Leu Thr Leu Gly
225 230 235 240

Gln Lys Thr Ser Gln Gly Glu Ile Phe Asp Ser Val Pro Phe Thr Gly
245 250 255

Val Met Leu Ala Ser Asp Asp Asn Met Val Pro Tyr Ser Glu Arg Gln
260 265 270

Phe Ala Pro Val Val Arg Gly Ile Ala Arg Thr Gln Ala Arg Val Glu
275 280 285

Val Lys Gln Asn Gly Tyr Thr Ile Tyr Asn Thr Thr Val Ala Pro Gly
290 295 300

Pro Phe Ala Leu Arg Asp Leu Ser Val Thr Asp Ser Ser Gly Asp Leu
305 310 315 320

His Val Thr Val Trp Glu Ala Asp Gly Ser Thr Gln Met Phe Val Val
325 330 335

Pro Tyr Gln Thr Pro Ala Ile Ala Leu His Gln Gly Tyr Leu Lys Tyr
340 345 350

Ser Leu Leu Ala Gly Arg Tyr Arg Ser Ser Asp Ser Ala Thr Asp Lys
355 360 365

Arg Gln Ile Ala Gln Ala Thr Leu Met Tyr Gly Leu Pro Trp Asn Leu
370 375 380

Thr Ala Tyr Gly Gly Ile Gln Ser Ala Thr His Asn Gln Ala Ala Leu
385 390 395 400

Leu Gly Leu Gly Gly Ser Leu Gly Arg Trp Gly Ser Leu Ser Val Asp
405 410 415

Gly Ser Asp Thr His Ser Gln Arg Gln Gly Glu Ala Val Gln Gln Gly
420 425 430

Ala Ser Trp Arg Leu Arg Tyr Ser Asn Gln Leu Thr Ala Thr Gly Thr
435 440 445

Asn Phe Phe Leu Thr Arg Trp Gln Tyr Ala Ser Gln Gly Tyr Asn Thr
450 455 460

Leu Ser Asp Val Leu Asp Ser Tyr Arg His Asn Gly Asn Arg Leu Trp
465 470 475 480



Ser Trp Arg Glu Asn Leu Gln Pro Ser Ser Arg Thr Thr Leu Met Leu
485 490 495

Ser Gln Ser Trp Gly Arg His Leu Gly Asn Leu Ser Leu Thr Gly Ser
500 505 510

Arg Thr Asp Trp Arg Asn Arg Pro Gly His Asp Asp Ser Tyr Gly Leu
515 520 525

Ser Trp Gly Thr Ser Ile Gly Gly Gly Ser Leu Ser Leu Asn Trp Asn
530 535 540

Gln Asn Arg Thr Leu Trp Arg Asn Gly Ala His Arg Lys Glu Asn Ile
545 550 555 560

Thr Ser Leu Trp Phe Ser Met Pro Leu Ser Arg Trp Thr Gly Asn Asn
565 570 575

Val Ser Ala Ser Trp Gln Met Thr Ser Pro Ser His Gly Gly Gln Thr
580 585 590

Gln Gln Val Gly Val Asn Gly Glu Ala Phe Ser Gln Gln Leu Asp Trp
595 600 605

Glu Val Arg Gln Ser Tyr Arg Ala Asp Ala Pro Pro Gly Gly Gly Asn
610 615 620

Asn Ser Ala Leu His Leu Ala Trp Asn Gly Asp Tyr Gly Leu Leu Gly
625 630 635 640

Gly Asp Tyr Ser Tyr Ser Arg Ala Met Arg Gln Met Gly Val Asn Ile
645 650 655

Ala Gly Gly Ile Val Ile His His His Gly Val Thr Leu Gly Gln Pro
660 665 670

Leu Gln Gly Ser Val Ala Leu Val Glu Ala Pro Gly Ala Ser Gly Val
675 680 685

Pro Val Gly Gly Trp Pro Gly Val Lys Thr Asp Phe Arg Gly Asp Thr
690 695 700

Thr Val Gly Asn Leu Asn Val Tyr Gln Glu Asn Thr Val Ser Leu Asp
705 710 715 720

Pro Ser Arg Leu Pro Asp Asp Ala Glu Val Thr Gln Thr Asp Val Arg
725 730 735

Val Val Pro Thr Glu Gly Ala Val Val Glu Ala Lys Phe His Thr Arg
740 745 750

Ile Gly Ala Arg Ala Leu Met Thr Leu Lys Arg Glu Asp Gly Ser Ala
755 760 765

Ile Pro Phe Gly Ala Gln Val Thr Val Asn Gly Gln Asp Gly Ser Ala
770 775 780

Ala Leu Val Asp Thr Asp Ser Gln Val Tyr Leu Thr Gly Leu Ala Asp
785 790 795 800



Lys Gly Glu Leu Thr Val Lys Trp Gly Ala Gln Gln Cys Arg Val Asn
805 810 815

Tyr Arg Leu Pro Ala His Lys Gly Ile Ala Gly Leu Tyr Gln Met Ser
820 825 830

Gly Leu Cys Arg
835



<210> 5 
<211> 156 
<212> PRT 

<213> Salmonella typhimurium 
<400> 5 

Met Trp Met Lys Ile Gln Arg Val Lys Thr Val Ile Tyr Ser Val Ser
1 5 10 15

Leu Leu Val Ala Ala Ser Ser Leu Val Pro Ile Ala Asn Ala Ala Glu
20 25 30

Lys Leu Gln Thr Thr Leu Arg Val Gly Thr Tyr Phe Arg Ala Gly His
35 40 45

Val Pro Asp Gly Met Val Leu Ala Gln Gly Trp Val Thr Tyr His Gly
50 55 60

Ser His Ser Gly Phe Arg Val Trp Ser Asp Glu Gln Lys Ala Gly Asn
65 70 75 80

Thr Pro Thr Val Leu Leu Leu Ser Gly Gln Gln Asp Pro Arg His His
85 90 95

Ile Gln Val Arg Leu Glu Gly Glu Gly Trp Gln Pro Asp Thr Val Ser
100 105 110

Gly Arg Gly Ala Ile Leu Arg Thr Ala Ala Asp Asn Ala Ser Phe Ser
115 120 125

Val Val Val Asp Gly Asn Gln Glu Val Pro Ala Asp Thr Trp Thr Leu
130 135 140

Asp Phe Lys Ala Cys Ala Leu Ala Gln Glu Asp Thr
145 150 155
SEQUENCE LISTING NO. 2

<110> Folkesson, Anders

<120> The complete sequence of the tcf insert of Salmonella
enterica serovar Typhi.

<130> The tcf insert in Salmonella typhi

<140>
<141>

<160> 6

<170> PatentIn Ver. 2.1

<210> 1
<211> 9253
<212> DNA

<213> Salmonella typhi

<220>
<221> CDS

<222> (1898)..(2608)

<223> tcfA putative fimbrial subunit

<220>
<221> CDS

<222> (2659)..(3234)

<223> tcfB putative fimbrial subunit

<220>
<221> CDS

<222> (3360)..(6029)

<223> tcfC putative fimbrial subunit

<220>
<221> CDS

<222> (6052)..(7131)

<223> tcfD putative fimbrial subunit

<220>
<221> CDS

<222> (7264)..(7719)

<223> tinR putative transcriptional regulator
<400> 1

tgtaagtatc ccgcataatc gagccattca catttagaga tcatccggca taatcaatct 60 
gccaacgcag gagatcgctg tgcgtaaagc ccgtattact gcgcaccaga tcatcrgctgt 120 
gattagatca gttgaatccg gacggactgt taaagatgtc taccgggogg ccggtatttc 180 
tgaagccacc agggacaact ggaagtctgg atacggcggc atggaagctt ctgatattaa 240 
atcttgagga tgtcaacgcc aggatttatg gtcgtttttc gctattttta atatccgctg 300 
tttgatcact tctgctgtcc gctttccgcc atttcatctt cactgattgg cgttgcgctt 360 
ttggtcagcg ccccgacttt gcgtttttcc ttcagttggt aactttctcc tttgatattc 420 




agtgtggttq 


agtgatgcag 


tagccgatcc 


aggategctg 


ttgccagcac 


gt tatcaccg 


460 


aacatctctc 


cccagtctgc 


oaaacct ttcr 


tttgaegtea 


aaataatoct 


cgcttt ttca 


540 


taccgtcggt 


tcagcagtcg 


gaagaacaga 


ctggcttcct 


cactggtcat 


toacaoataa 


SOO 


cctatttcgt 


ccaggatcag 


cccccgcgca 


tagctcagtt 


QttCttaQCtQ 

^^^^^^ 


Qcqctccaqci 


660 


cggttt tcca 


get tegett t 


catcagcgtc 


taccaacctg 


tcctacooca 

»■ » » » j «3 w** 


tgaacaacac 


720 


ccoataaccc 


acatctocca 


ctttcacacc 


gggggcagcg 


accaoataaa 


ttttctccac 


780 




ccaacaacrat 


cacattctcg 


cagcgctcca 


cgaacgccag 


accaaccaac 


840 


t cc cggacga 


cct tacgatc 


gatgectgge 


tggaagctga 


aatcaaacta 


ctccagcgtt 


900 






ccgtttcagc 


egggattcca 


h f rrororf a 


ci ty t w *y^^-y 


960 




yt ty tciyt- y t 


catgeacagg 


aatgtcgact 


t~ A f" t" t" dClfTf 

lulll i-yyy t 


ty at-ay ctyy^ 


1020 




□a^av t ty t d 


tattcatgtt gcttactcgc 


ci t <t a a <x y y y y 


y aaavj ^aa^t> 


1080 




l_ LUdCaUUl.fi 


agggactggc 


atcaat t tga 






t -1 M. it 






tattttcatg 


gtgagttgtt 




t ty t ty 1 1 ty 


1200 


tooaatattr 


aactaatatc 


taatctaaaa 


atagtgtatt 


atattoatta 


cactttocta 


1260 


y & 3 Q y y t*~3 ^c* 




gtaaccatat 


gatgtatata 


AGtttttOtt 


tact aat ate 


1320 






gcaagtaaat 


ctgatctaac 




at ttfcct aaa 


1380 


htaattacft 


tact t~ act" t t 


tttttacceg 


gttttgtaaa 


a^uLW ci w 




1440 


t" ttatDOCtt 


■ho t t* aaaoh 


ttatttggtt 


ttgtagtctg 


a At tat" atct 


ccctrctGtsa 

^ W ^ *-« W J J 


1500 


yyy i. u n» u 




gtctcctacg 


ttgttatgtt 


ai^y tat u ty t 


ty ttt ty ctciy 


1560 


gagggggaaa cacagttcca tttatctgag taagtcaggt acacagtaac aactttctta 1620
tgaagaattt ccaaaatttt tactgcggcg ttattaattg ttcagcgatt cttacagatc 1680
tgtcgttcgc ttttggtgaa tgaaatccgt ggacttttat ttactaattt tttctttcct 1740
gaaaaaaaca gaggtattga gcgaaaaatt ttattccgta tgatgccctc cacacaaaat 1800
gtattaacac tgaatcgtaa tttgcttctt tatgctgata actttctgtc tatgctaata 1860
ctaaaattta gatgactttt atacggtaaa atctggt atg aat ttt aaa gat act 1915
Met Asn Phe Lys Asp Thr
1 5


ctt ccc ggg gtg ttt ctc tgt gtc gct atg ttt gca tgt ggt cat gcc 1963
Leu Pro Gly Val Phe Leu Cys Val Ala Met Phe Ala Cys Gly His Ala
10 15 20

agg gcg aat atg ctc gtt tat ccc atg gcg gca gaa att aat agt agc 2011
Arg Ala Asn Met Leu Val Tyr Pro Met Ala Ala Glu Ile Asn Ser Ser
25 30 35

cgc gaa gag gcc acc tcg ctg ttc gtc tat tct aaa tca gat cat gtg 2059
Arg Glu Glu Ala Thr Ser Leu Phe Val Tyr Ser Lys Ser Asp His Val
40 45 50

caa tat att cga aca aga atc atg cgt att gaa cac ccc ggt atg cca 2107
Gln Tyr Ile Arg Thr Arg Ile Met Arg Ile Glu His Pro Gly Met Pro
55 60 65 70

cag gag aag gag gta cca gca ggg aat gat ata gag aca gga ctt gtt 2155
Gln Glu Lys Glu Val Pro Ala Gly Asn Asp Ile Glu Thr Gly Leu Val
75 80 85

gtc tcc ccg gag aaa ttt gct ctt tcc ccg gga aca aaa aaa aca ata 2203
Val Ser Pro Glu Lys Phe Ala Leu Ser Pro Gly Thr Lys Lys Thr Ile
90 95 100

cgc gtt atc agt act cag gca ccg gaa aga gag gaa gcc tgg cgg gta 2251
Arg Val Ile Ser Thr Gln Ala Pro Glu Arg Glu Glu Ala Trp Arg Val
105 110 115


tac 
Tyr 


ttc gag get 
Phe Glu Ala 
120 


att 
Val 


cct 
Pro 


aaa 
Glu 
125 


ctg 
Leu 


gaa gat gat 
Glu Asp Asp 


cca 
Pro 
130 


caa aca aac 
Gin Ala Gly 


gga 
Gly 


2299 


aag 
hys 
135 


caa aat 
Gin Asn 


tea 
Ser 


tec 
Ser 


ata 
Val 
140 


aat 
Ser 


gtg 

Val 


aat 
Asn 


ctt 
Leu 


gtc 
val 

145 


'■as 

Trp 


qoo ejta tta 
Gly Val Leu 


ctg 
Leu 
150 


2347 


cot 
Arg 


gtt 
val 


tct 
Ser 


ccg 
Pro 


tea 
Ser 
155 


aac 
Asp 


ccc 
Pro 


agg 
Arg 


cct 
Pro 


gcg 
Ala 
160 


ctg 
Leu 


val 


acg gac ggt 
Thr Asp Gly 
165 


cac 
Hie 


2395 


cac ctg ctg aat acg gga aac aca cgg ctt tct ctt att cgg gct ggc 2443
His Leu Leu Asn Thr Gly Asn Thr Arg Leu Ser Leu Ile Arg Ala Gly
170 175 180

aac tgc gac acc aca tgc cac tgg cag aat ata ggc aaa agt att tat 2491
Asn Cys Asp Thr Thr Cys His Trp Gln Asn Ile Gly Lys Ser Ile Tyr
185 190 195

ccc ggc ggg agt gct gat att ccg gcc gga ata aaa agt aat gca ttt 2539
Pro Gly Gly Ser Ala Asp Ile Pro Ala Gly Ile Lys Ser Asn Ala Phe
200 205 210

cgt gtg gaa tat cgt acg ggt gca aat tca ccg gta atc tct gct gat 2587
Arg Val Glu Tyr Arg Thr Gly Ala Asn Ser Pro Val Ile Ser Ala Asp
215 220 225 230



tta aca gca gcc gga aag taa aaacacacgg agcgtacgct ataccctaca 2638
Leu Thr Ala Ala Gly Lys
235

tttattctca gggggagcgg atg tat acc gag tgt aca tat atc act gta ata 2691
Met Tyr Thr Glu Cys Thr Tyr Ile Thr Val Ile
240 245

aac aac aaa gca agg tta ttt ttt atg aac atg aaa aca tct ttt att 2739
Asn Asn Lys Ala Arg Leu Phe Phe Met Asn Met Lys Thr Ser Phe Ile
250 255 260

gcc gca gct gtg gca ttg gcc acc gtt tat tct ttt tct gtt tct gcg 2787
Ala Ala Ala Val Ala Leu Ala Thr Val Tyr Ser Phe Ser Val Ser Ala
265 270 275 280

gtt cag aag gat att acc gtc act gcc aat att gac agt aca ctt gaa 2835
Val Gln Lys Asp Ile Thr Val Thr Ala Asn Ile Asp Ser Thr Leu Glu
285 290 295

ctg ctg cag gcc gat ggt tca tcc ctc ccg tcg act atg aag ctg gat 2883
Leu Leu Gln Ala Asp Gly Ser Ser Leu Pro Ser Thr Met Lys Leu Asp
300 305 310

ttc atg ccg ggt aag ggc ctg gtc cat aaa tca ctc cag acc cgc ctt 2931
Phe Met Pro Gly Lys Gly Leu Val His Lys Ser Leu Gln Thr Arg Leu
315 320 325

tac agc aac gat cag acc aag tcg gtt aat gta aaa ctg ttg aat gct 2979
Tyr Ser Asn Asp Gln Thr Lys Ser Val Asn Val Lys Leu Leu Asn Ala
330 335 340

cca caa ctt atc aac gtc ctg gat ccc acc aaa acc att gat atg gaa 3027
Pro Gln Leu Ile Asn Val Leu Asp Pro Thr Lys Thr Ile Asp Met Glu
345 350 355 360

gtg act ctg gga gga cgg tca ctg acc acc acc aat tct gta ctg gaa 3075
Val Thr Leu Gly Gly Arg Ser Leu Thr Thr Thr Asn Ser Val Leu Glu
365 370 375

gct aaa acc ctg ttc ccg gac gga aaa act ggc gat gct tca gct ctg 3123
Ala Lys Thr Leu Phe Pro Asp Gly Lys Thr Gly Asp Ala Ser Ala Leu
380 385 390

ctg aac ctg gat att ggt cag aag gct gga gca gcc tta caa aac ctg 3171
Leu Asn Leu Asp Ile Gly Gln Lys Ala Gly Ala Ala Leu Gln Asn Leu
395 400 405

cct gcc ggt gaa tac agc gga ttg gtc agt ctg gtg att tca cag gct 3219
Pro Ala Gly Glu Tyr Ser Gly Leu Val Ser Leu Val Ile Ser Gln Ala
410 415 420

gtc act gcc ggc taa taactggtta ttagctcttc atctgatccg gttttggggg 3274
Val Thr Ala Gly
425

gcaccgttcg tacctgaacc ggatccggta ttgatcttat tattcattgc aattcaggtc 3334 

tctttacgtg agtcgttatt tctgg atg tat tat tta ctg gga ttg tgc agt 3386
Met Tyr Tyr Leu Leu Gly Leu Cys Ser
430 435

ttt acc agc cag gca act ctt att ccc cct cct gga ttt gaa tct ctg 3434
Phe Thr Ser Gln Ala Thr Leu Ile Pro Pro Pro Gly Phe Glu Ser Leu
440 445 450

ctg gaa gga cag act gag caa att gaa gtg ttg cta cca ggg cat tca 3482
Leu Glu Gly Gln Thr Glu Gln Ile Glu Val Leu Leu Pro Gly His Ser
455 460 465 470

ctg gga tta ttt ccg gtg gtg gtt aaa ccg gac acc gtg cag ttc atg 3530
Leu Gly Leu Phe Pro Val Val Val Lys Pro Asp Thr Val Gln Phe Met
475 480 485

tcc cca ttg atg gta ctt gaa agc agt ggg ctt gcc gcg ttg ccg gcc 3578
Ser Pro Leu Met Val Leu Glu Ser Ser Gly Leu Ala Ala Leu Pro Ala
490 495 500

gca gaa cgg caa aaa gcg ctg gct gca ctc agc cgt ccg ttg cta cgt 3626
Ala Glu Arg Gln Lys Ala Leu Ala Ala Leu Ser Arg Pro Leu Leu Arg
505 510 515

aac agc aat ctg gtc tgt ggt gtc tca gaa gca aaa gac agc agc gag 3674
Asn Ser Asn Leu Val Cys Gly Val Ser Glu Ala Lys Asp Ser Ser Glu
520 525 530

tgt ggt tac gtg gca aca gat aaa gag gat gtt gcg gtt att ttt gat 3722
Cys Gly Tyr Val Ala Thr Asp Lys Glu Asp Val Ala Val Ile Phe Asp
535 540 545 550

gag aac aac gct cag tta tct ttg ttt ctt aac cgg gac tgg ttg ccg 3770
Glu Asn Asn Ala Gln Leu Ser Leu Phe Leu Asn Arg Asp Trp Leu Pro
555 560 565

gat gaa gaa cga cgt gat aaa cgc tgg ctg act ccg acc ccg gag ggt 3818
Asp Glu Glu Arg Arg Asp Lys Arg Trp Leu Thr Pro Thr Pro Glu Gly
570 575 580

gtc agc gca ttt att cac cgc cag acg ctg tat ctg agt gat gat ctc 3866
Val Ser Ala Phe Ile His Arg Gln Thr Leu Tyr Leu Ser Asp Asp Leu
585 590 595

cac agt cgt aat atg aca ctg aat ggt agc ggt gcc ctg ggg ctt ggt 3914
His Ser Arg Asn Met Thr Leu Asn Gly Ser Gly Ala Leu Gly Leu Gly
600 605 610

gac ggt cgt tat ctg gga ggc gac tgg gcg gct atc tgg aat cag tca 3962
Asp Gly Arg Tyr Leu Gly Gly Asp Trp Ala Ala Ile Trp Asn Gln Ser
615 620 625 630

gaa cat tac aat aac agt cag gcc tgg ttt gac aat ctg ttt gtc cgt 4010
Glu His Tyr Asn Asn Ser Gln Ala Trp Phe Asp Asn Leu Phe Val Arg
635 640 645

cag gat ctc ggc aat cag tat tat ctc cag gct ggt cgg atg gat cag 4058
Gln Asp Leu Gly Asn Gln Tyr Tyr Leu Gln Ala Gly Arg Met Asp Gln
650 655 660

cgg aat ctg tcc agc gcc acg ggg ggg gat ttt ggg ttc agt ctg ctt 4106
Arg Asn Leu Ser Ser Ala Thr Gly Gly Asp Phe Gly Phe Ser Leu Leu
665 670 675

ccc ctg agc cgg ttt gat gga tta cga acc ggg acc acc caa gct tat 4154
Pro Leu Ser Arg Phe Asp Gly Leu Arg Thr Gly Thr Thr Gln Ala Tyr
680 685 690

gtt aac cat gag gtg gac cat aat gcc act ccg gtt atg gtt cag gtt 4202
Val Asn His Glu Val Asp His Asn Ala Thr Pro Val Met Val Gln Val
695 700 705 710
acc cga aat gcc cgt att gat att tat cgt ggc agc gag ttg ctg ggg 4250
Thr Arg Asn Ala Arg Ile Asp Ile Tyr Arg Gly Ser Glu Leu Leu Gly
715 720 725

agt cag ttc ctg acc ccg gga atg cat acc ctg gat act cat tct ctt 4298
Ser Gln Phe Leu Thr Pro Gly Met His Thr Leu Asp Thr His Ser Leu
730 735 740

cca ccg gga agc tat cct ctg gcg ttg cgg gtg tat gag gat ggg att 4346
Pro Pro Gly Ser Tyr Pro Leu Ala Leu Arg Val Tyr Glu Asp Gly Ile
745 750 755

ctg cgg cga acg gag acc cag ccc ttc agt aag ggg ggc aat agc ttc 4394
Leu Arg Arg Thr Glu Thr Gln Pro Phe Ser Lys Gly Gly Asn Ser Phe
760 765 770

agt gca cag acc cag tgg ttt att cag ggc ggg ctg gaa gat acc ggg 4442
Ser Ala Gln Thr Gln Trp Phe Ile Gln Gly Gly Leu Glu Asp Thr Gly
775 780 785 790

gat aaa gcc agc cat tat gac ggt gag act gtc atg gct gcc gga ttc 4490
Asp Lys Ala Ser His Tyr Asp Gly Glu Thr Val Met Ala Ala Gly Phe
795 800 805

caa act ggg ctg cgg aaa aat atc agt ctg acc gaa ggt atc tct ctg 4538
Gln Thr Gly Leu Arg Lys Asn Ile Ser Leu Thr Glu Gly Ile Ser Leu
810 815 820

gca cat gag gcc tgg tac agt gaa acc cga ctg aat tca cag cat gca 4586
Ala His Glu Ala Trp Tyr Ser Glu Thr Arg Leu Asn Ser Gln His Ala
825 830 835

gtg ctg gat ggc acg ctg gac ctt tct gcc ggg ata ctg cat ggg aca 4634
Val Leu Asp Gly Thr Leu Asp Leu Ser Ala Gly Ile Leu His Gly Thr
840 845 850

gac agc acg agc ggt aac act gag cag gtg aca tac aac gac gga ttt 4682
Asp Ser Thr Ser Gly Asn Thr Glu Gln Val Thr Tyr Asn Asp Gly Phe
855 860 865 870

tcc gcg agt ctg tgg cgt aac cat acg gaa agt gat gcc tgt agt ggt 4730
Ser Ala Ser Leu Trp Arg Asn His Thr Glu Ser Asp Ala Cys Ser Gly
875 880 885

cgt cat cca cag tca gtg cat gcc agt atg acc tgc cag act tcg atg 4778
Arg His Pro Gln Ser Val His Ala Ser Met Thr Cys Gln Thr Ser Met
890 895 900

aac gcc tcc ctg tcg gtt tcg gtg ggg aac tgg tat gcc cta ctg gga 4826
Asn Ala Ser Leu Ser Val Ser Val Gly Asn Trp Tyr Ala Leu Leu Gly
905 910 915

tac agt acc agc agg aca gaa ggt cgg ccg gtt tac cgg gga tat gat 4874
Tyr Ser Thr Ser Arg Thr Glu Gly Arg Pro Val Tyr Arg Gly Tyr Asp
920 925 930

gat aac agt gac aaa gaa aat gtg ttc tgg cga cag gca tac atc cct 4922
Asp Asn Ser Asp Lys Glu Asn Val Phe Trp Arg Gln Ala Tyr Ile Pro
935 940 945 950
gcc tct cac cgc gaa tct gct cag gct agt gca acg tac agc ctt aat 4970
Ala Ser His Arg Glu Ser Ala Gln Ala Ser Ala Thr Tyr Ser Leu Asn
955 960 965

atg gct ggc atg aat att aat acc cat ggg gga gta tgg cga acc cga 5018
Met Ala Gly Met Asn Ile Asn Thr His Gly Gly Val Trp Arg Thr Arg
970 975 980

aat gac gga gtg aat gat gat ggc ttg ttt atg agt gtc agt gtg tca 5066
Asn Asp Gly Val Asn Asp Asp Gly Leu Phe Met Ser Val Ser Val Ser
985 990 995

tat gcc tct caa cca ccg aca atg act ggc agt aat agg tat acc tca 5114
Tyr Ala Ser Gln Pro Pro Thr Met Thr Gly Ser Asn Arg Tyr Thr Ser
1000 1005 1010

gcc ggg acc gat att cac agt agc cgg aat caa aaa aca cag acg tcc 5162
Ala Gly Thr Asp Ile His Ser Ser Arg Asn Gln Lys Thr Gln Thr Ser
1015 1020 1025 1030

tgg aat gtg aac cat gtg aga tcc tgg cag cag gat ctg tat cgt gaa 5210
Trp Asn Val Asn His Val Arg Ser Trp Gln Gln Asp Leu Tyr Arg Glu
1035 1040 1045

ctg tcg gtg ggt ttc tcc ggt tat aac gac gac agc tgg agc ggg agt 5258
Leu Ser Val Gly Phe Ser Gly Tyr Asn Asp Asp Ser Trp Ser Gly Ser
1050 1055 1060

ctc ggc gga cgc atg agc ggc cgt atg ggt gaa ctg agc gcc act atc 5306
Leu Gly Gly Arg Met Ser Gly Arg Met Gly Glu Leu Ser Ala Thr Ile
1065 1070 1075

agt aac tcc cat caa cgt aat gcg ggc agc gcc agt tca ctc acc gct 5354
Ser Asn Ser His Gln Arg Asn Ala Gly Ser Ala Ser Ser Leu Thr Ala
1080 1085 1090

ggc tac agc tcg tct ctg gcg tta tcc cgt aat gga ctg ttc tgg gga 5402
Gly Tyr Ser Ser Ser Leu Ala Leu Ser Arg Asn Gly Leu Phe Trp Gly
1095 1100 1105 1110

ggt ggt cag gac ggt gaa ccg gcc tct ggc atg gcg gtg aac gtg gag 5450
Gly Gly Gln Asp Gly Glu Pro Ala Ser Gly Met Ala Val Asn Val Glu
1115 1120 1125

tca gag ggg gac gag ggc agt agc ggg aaa gta gtc agc gtt cgt ggc 5498
Ser Glu Gly Asp Glu Gly Ser Ser Gly Lys Val Val Ser Val Arg Gly
1130 1135 1140

agc agc cag ccg ttc agt ctc ggt ttt ggt cag cag tcg ctg ttg ctg 5546
Ser Ser Gln Pro Phe Ser Leu Gly Phe Gly Gln Gln Ser Leu Leu Leu
1145 1150 1155

atg gaa ggc tat aac gcc acg gag gtg acc att gag gat gca ggg gtt 5594
Met Glu Gly Tyr Asn Ala Thr Glu Val Thr Ile Glu Asp Ala Gly Val
1160 1165 1170

agt tca cag ggt atg gca ggc gta aaa gcg gga ggg gga agc agg tgt 5642
Ser Ser Gln Gly Met Ala Gly Val Lys Ala Gly Gly Gly Ser Arg Cys
1175 1180 1185 1190

tac ttc ctg aca ccc ggg cat ctg ctg gtt cac aac atc agc gcc agt 5690
Tyr Phe Leu Thr Pro Gly His Leu Leu Val His Asn Ile Ser Ala Ser
1195 1200 1205

atg agc cga ctg tac gtt ggc cgc gta ctg gac aag gat ggc aga ccg 5738
Met Ser Arg Leu Tyr Val Gly Arg Val Leu Asp Lys Asp Gly Arg Pro
1210 1215 1220

ctg ctg gac gca cag cca ctg aac tat cca ttt ttg tcg ttg gga cct 5786
Leu Leu Asp Ala Gln Pro Leu Asn Tyr Pro Phe Leu Ser Leu Gly Pro
1225 1230 1235

tcc ggg cga ttt agc ctg cag agc gag cat aaa gaa tcc agc ctg tgg 5834
Ser Gly Arg Phe Ser Leu Gln Ser Glu His Lys Glu Ser Ser Leu Trp
1240 1245 1250

ctg ctg tct aaa aac agg atc ctg cgt tgt ccg atg tca gta cat aaa 5882
Leu Leu Ser Lys Asn Arg Ile Leu Arg Cys Pro Met Ser Val His Lys
1255 1260 1265 1270

cgt cgg gat gtt atg cag gta gtg ggt gat gtg cgg tgt gaa tta agt 5930
Arg Arg Asp Val Met Gln Val Val Gly Asp Val Arg Cys Glu Leu Ser
1275 1280 1285

gac gtg gat gcc ctg cca cag gcg ttg caa ata tcg ccg cgg gtc atc 5978
Asp Val Asp Ala Leu Pro Gln Ala Leu Gln Ile Ser Pro Arg Val Ile
1290 1295 1300

cgt ttg ctg aac gtg gca ggt ttg ctg cgc cat tcc gtt cag gaa gcc 6026
Arg Leu Leu Asn Val Ala Gly Leu Leu Arg His Ser Val Gln Glu Ala
1305 1310 1315

tga cgtagagata aaggcgttaa ct atg agt aat aaa atg aag tgg acg agt 6078

Met Ser Asn Lys Met Lys Trp Thr Ser
1320 1325

atg aca gcc cat tgg tca gca att att aat ttc atc cga aaa tat gtt 6126
Met Thr Ala His Trp Ser Ala Ile Ile Asn Phe Ile Arg Lys Tyr Val
1330 1335 1340

tat cca gca agg ata att gcc atc ctg ctg atg gct ggc gct aca ctg 6174
Tyr Pro Ala Arg Ile Ile Ala Ile Leu Leu Met Ala Gly Ala Thr Leu
1345 1350 1355 1360

cca caa gtc gcc gat gcg att acc gtc gac ctg aat tac gac aag aac 6222
Pro Gln Val Ala Asp Ala Ile Thr Val Asp Leu Asn Tyr Asp Lys Asn
1365 1370 1375

aat gta gcg gtc atc act cct gtc tgg tcc caa gaa tgg agt gta gca 6270
Asn Val Ala Val Ile Thr Pro Val Trp Ser Gln Glu Trp Ser Val Ala
1380 1385 1390

aat gtg ttg ggg gga tgg gta tgt cgt tca aac agg aat gaa aat gag 6318
Asn Val Leu Gly Gly Trp Val Cys Arg Ser Asn Arg Asn Glu Asn Glu
1395 1400 1405

ggg gcg tgt gaa gaa aca cat ttg gta tgg tgg tat gct ttt gga gct 6366
Gly Ala Cys Glu Glu Thr His Leu Val Trp Trp Tyr Ala Phe Gly Ala
1410 1415 1420

tat tca aaa att cgt ctg cgt ttc aga gaa caa atc agc cat gcc gaa 6414
Tyr Ser Lys Ile Arg Leu Arg Phe Arg Glu Gln Ile Ser His Ala Glu
1425 1430 1435 1440

att acg ctc ata ctg ctc ggc agt gtt cgt gat gcc tgt tat act ggt 6462
Ile Thr Leu Ile Leu Leu Gly Ser Val Arg Asp Ala Cys Tyr Thr Gly
1445 1450 1455

gtc atc aac atg aac gct gct gca tgt caa tgg ggt agg tcg ctg aaa 6510
Val Ile Asn Met Asn Ala Ala Ala Cys Gln Trp Gly Arg Ser Leu Lys
1460 1465 1470

ctt agg ata cct tca gaa gag ctt gcg aag ata cct aca agc gga aca 6558
Leu Arg Ile Pro Ser Glu Glu Leu Ala Lys Ile Pro Thr Ser Gly Thr
1475 1480 1485

tgg aaa gca acg tta gtc ctg gat tat tta caa tgg ggc gga gac gat 6606
Trp Lys Ala Thr Leu Val Leu Asp Tyr Leu Gln Trp Gly Gly Asp Asp
1490 1495 1500

cct tta ggc aca tca act aca gat atc acg ctg aat gta aca gac cac 6654
Pro Leu Gly Thr Ser Thr Thr Asp Ile Thr Leu Asn Val Thr Asp His
1505 1510 1515 1520

ttt gct gaa aat gcg gct att tac ttt ccg caa ttt ggt aca gca acg 6702
Phe Ala Glu Asn Ala Ala Ile Tyr Phe Pro Gln Phe Gly Thr Ala Thr
1525 1530 1535

ccc cgg gtg gac ctg aat ctt cac cgg atg aat gcc tca caa atg tcg 6750
Pro Arg Val Asp Leu Asn Leu His Arg Met Asn Ala Ser Gln Met Ser
1540 1545 1550

ggc agg gct aat ctg gat atg tgt ctg tat gac gga ggt gtg aaa gcc 6798
Gly Arg Ala Asn Leu Asp Met Cys Leu Tyr Asp Gly Gly Val Lys Ala
1555 1560 1565

cgt tca tta cag atg aag ata gaa gga agc aat aag tca ggt acg gga 6846
Arg Ser Leu Gln Met Lys Ile Glu Gly Ser Asn Lys Ser Gly Thr Gly
1570 1575 1580

ttt cag gtt ata aag agc gat tct gct gat acg att gat tat gcg gtc 6894
Phe Gln Val Ile Lys Ser Asp Ser Ala Asp Thr Ile Asp Tyr Ala Val
1585 1590 1595 1600

agt atg aat tat ggg gga cga agt att cct gtc acc cgt ggc gtg gag 6942
Ser Met Asn Tyr Gly Gly Arg Ser Ile Pro Val Thr Arg Gly Val Glu
1605 1610 1615

ttc agt ctg gat aac gtg gat aaa gca gca acg cgt ccg gtg gta ctt 6990
Phe Ser Leu Asp Asn Val Asp Lys Ala Ala Thr Arg Pro Val Val Leu
1620 1625 1630

ccc ggg caa cgg cag gcg gta cgt tgt gtg cca gtg ccc ctt acc ctg 7038
Pro Gly Gln Arg Gln Ala Val Arg Cys Val Pro Val Pro Leu Thr Leu
1635 1640 1645
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aca aca caa ccc ttt aac ate aga gag aag cgt tct ggt gag tat cag 7086 
Thr Thr Gin Pro Phe Asn lie Arg Glu LyB Arg Ser Gly Glu Tyr Gin 
1650 1655 1660 

gga acg ctg aca gtg aca atg ctg atg gga aca caa acc ccc tga 7131 
Gly Thr Leu Thr Val Thr Met Leu Met Gly Thr Gin Thr Pro 
1665 1670 167S 

cagtaattat ttattttatt gatatctttc ttatatggtt ttttaaatca gagttctctt 7191 

tatatacttg ttttatttaa taaagagaat ctattcactt atgaaaatca atgcgtgagg 7251 

ttctgctttc ct atg act gtg tat tta gat gat aaa gat aaa gaa tta ttg 7302 
Met Thr Val Tyr Leu Asp Asp Lys Asp Lys Glu Leu Leu 
1680 1685 1690 

aaa gaa atc caa aaa gat tgt gca caa act tta tgg caa ctt gca tat 7350 
Lys Glu Ile Gln Lys Asp Cys Ala Gln Thr Leu Trp Gln Leu Ala Tyr 
1695 1700 1705 

aaa gtg gga ctt acg ccc aca cca tgt ttc aaa cgt tta aaa aaa ctt 7398 
Lys Val Gly Leu Thr Pro Thr Pro Cys Phe Lys Arg Leu Lys Lys Leu 
1710 1715 1720 

aaa gac agg ggg gtt atc att ggt cag ttc gct tta ttg gat aag gaa 7446 
Lys Asp Arg Gly Val Ile Ile Gly Gln Phe Ala Leu Leu Asp Lys Glu 
1725 1730 1735 1740 

aaa cta ggt ctt tca ctt aat gtc ttt att atg att aac ata tct gag 7494 
Lys Leu Gly Leu Ser Leu Asn Val Phe Ile Met Ile Asn Ile Ser Glu 
1745 1750 1755 

gag caa tac gct agt att tct gag aaa ata aag tca atg cct gag gtt 7542 
Glu Gln Tyr Ala Ser Ile Ser Glu Lys Ile Lys Ser Met Pro Glu Val 
1760 1765 1770 

att gcc ttc tat cga att tct gga tca ttt aat tat tta atg cat aca 7590 
Ile Ala Phe Tyr Arg Ile Ser Gly Ser Phe Asn Tyr Leu Met His Thr 
1775 1780 1785 

gta ttt aca gat atg aac gat tac tat agt ttt tat gag aaa ata ata 7638 
Val Phe Thr Asp Met Asn Asp Tyr Tyr Ser Phe Tyr Glu Lys Ile Ile 
1790 1795 1800 

tta act aat tct tca att agt gga tct gca tcg agc ttt gtt ctt gag 7686 
Leu Thr Asn Ser Ser Ile Ser Gly Ser Ala Ser Ser Phe Val Leu Glu 
1805 1810 1815 1820 

caa ata aag gaa aca aac gaa ctg tca gtg tga aagtgtgatg tgtacttact 7739 
Gln Ile Lys Glu Thr Asn Glu Leu Ser Val 
1825 1830 

gatttaatac attattatcc ttcttacgga acaacaacgg cagattgcgg ctgttgaaca 7799 

aggattttaa tcagcagtgg tgaaattaag cggcacagaa taacacagcg gaatatcaca 7859 

tggttaaata tcaccccgtg catgtaacaa aaaaccgcat taaaacagat gatgttactg 7919 

atatttattt cgttgaaccc ttctggaaaa aaggcgaaaa ccacataatt gagtcattga 7979 

tgttttttga agagttacaa aagtcattta atttattcaa ccataaatat gggttaaata 8039 

aatatatact caggatcccc tgggaatttg tgctcataca tatggaaagg atcagtaaat 8099 

taaatagcgt cgggttattt gctgtttctg ttgactttaa taacaaccac aaatttctga 8159 

gcgagtacat caggagtcgc agagattatg gtatggaagt ttggtttgat ttttgtggta 8219 

aacattctta ttccagtgaa attaaaaacc ttggattctt ttttcaggct tgcgtagtgc 8279 

ctcgtgatcc taattttatt agtagtgttt atcattatca taagttccaa aagattcttg 8339 

tcggggatat aaatgatgta gaacagaggg ccgtgtacca gaacgaagtt gattacatgt 8399 

atggaatgca atggccatcg tcatatgacg gttttttctt tcgggatcat aaaaaaaatg 8459 

aaacttggtg tatataacag aaggagtgaa aatttgaatc aaaaatatct tatttatttt 8519 

ttgtttaatt attgttttgt tttttattac gattaaatat aaagaacatc attgttcgtg 8579 

cggtggggag gccggaagtc taggggatga ccgtttatca acaattttat tacagccacc 8639 

acacgaatgg tttatatatg cactagatgc attattttag tttaatatat cgatggttgc 8699 

tatttgcatt gatgatgttc cgttacatta aggaatatac atctgtatct cgttatacgc 8759 

acactcacat tactaatcat tattaatatg agtgtggttc ttgttttacg catgcatggt 8819 

tgcatgtgac gttaaattta aatgagctga ctgtatgaat tctaaatact ttagagaggt 8879 

gttttttgtc tcggtagttg ttatattatt attttatttg gtgttatttg cagccagtgc 8939 

tcatgctgaa ggcggtttca gatctggagg cattgggtta cttatgacgg gaacaagaga 8999 

gatgctactg tagagataat aaattctgct aaagattccc caattcttgt gcattgacat 9059 

cctccacgtc ctgaagggcg tgggttcctg ctccaacggg ctgcctgact gcacgctcct 9119 

tccacaggca agcacggcgt gtcccgctct aaaatgttac gcgcgccgtt tacatcggcg 9179 

ttcgcagtat atcttcatac cagacacttg taagtatctc gcataatcgt gccattcaca 9239 

tttagagatc atac 

<210> 2 
<211> 236 
<212> PRT 
<213> Salmonella typhi 
<400> 2 

Met Asn Phe Lys Asp Thr Leu Pro Gly Val Phe Leu Cys Val Ala Met 
1 5 10 15 

Phe Ala Cys Gly His Ala Arg Ala Asn Met Leu Val Tyr Pro Met Ala 
20 25 30 

Ala Glu Ile Asn Ser Ser Arg Glu Glu Ala Thr Ser Leu Phe Val Tyr 
35 40 45 

Ser Lys Ser Asp His Val Gln Tyr Ile Arg Thr Arg Ile Met Arg Ile 
50 55 60 

Glu His Pro Gly Met Pro Gln Glu Lys Glu Val Pro Ala Gly Asn Asp 
65 70 75 80 

Ile Glu Thr Gly Leu Val Val Ser Pro Glu Lys Phe Ala Leu Ser Pro 
85 90 95 

Gly Thr Lys Lys Thr Ile Arg Val Ile Ser Thr Gln Ala Pro Glu Arg 
100 105 110 

Glu Glu Ala Trp Arg Val Tyr Phe Glu Ala Val Pro Glu Leu Glu Asp 
115 120 125 

Asp Pro Gln Ala Gly Gly Lys Gln Asn Ser Ser Val Ser Val Asn Leu 
130 135 140 

Val Trp Gly Val Leu Leu Arg Val Ser Pro Ser Asp Pro Arg Pro Ala 
145 150 155 160 

Leu Val Thr Asp Gly His His Leu Leu Asn Thr Gly Asn Thr Arg Leu 
165 170 175 

Ser Leu Ile Arg Ala Gly Asn Cys Asp Thr Thr Cys His Trp Gln Asn 
180 185 190 

Ile Gly Lys Ser Ile Tyr Pro Gly Gly Ser Ala Asp Ile Pro Ala Gly 
195 200 205 

Ile Lys Ser Asn Ala Phe Arg Val Glu Tyr Arg Thr Gly Ala Asn Ser 
210 215 220 

Pro Val Ile Ser Ala Asp Leu Thr Ala Ala Gly Lys 
225 230 235 



<210> 3 
<211> 191 
<212> PRT 
<213> Salmonella typhi 
<400> 3 

Met Tyr Thr Glu Cys Thr Tyr Ile Thr Val Ile Asn Asn Lys Ala Arg 
1 5 10 15 

Leu Phe Phe Met Asn Met Lys Thr Ser Phe Ile Ala Ala Ala Val Ala 
20 25 30 

Leu Ala Thr Val Tyr Ser Phe Ser Val Ser Ala Val Gln Lys Asp Ile 
35 40 45 

Thr Val Thr Ala Asn Ile Asp Ser Thr Leu Glu Leu Leu Gln Ala Asp 
50 55 60 

Gly Ser Ser Leu Pro Ser Thr Met Lys Leu Asp Phe Met Pro Gly Lys 
65 70 75 80 



Gly Leu Val His Lys Ser Leu Gln Thr Arg Leu Tyr Ser Asn Asp Gln 
85 90 95 

Thr Lys Ser Val Asn Val Lys Leu Leu Asn Ala Pro Gln Leu Ile Asn 
100 105 110 

Val Leu Asp Pro Thr Lys Thr Ile Asp Met Glu Val Thr Leu Gly Gly 
115 120 125 

Arg Ser Leu Thr Thr Thr Asn Ser Val Leu Glu Ala Lys Thr Leu Phe 
130 135 140 

Pro Asp Gly Lys Thr Gly Asp Ala Ser Ala Leu Leu Asn Leu Asp Ile 
145 150 155 160 

Gly Gln Lys Ala Gly Ala Ala Leu Gln Asn Leu Pro Ala Gly Glu Tyr 
165 170 175 

Ser Gly Leu Val Ser Leu Val Ile Ser Gln Ala Val Thr Ala Gly 
180 185 190 



<210> 4 
<211> 889 
<212> PRT 

<213> Salmonella typhi 
<400> 4 

Met Tyr Tyr Leu Leu Gly Leu Cys Ser Phe Thr Ser Gin Ala Thr Leu 
1 5 10 15 

Ile Pro Pro Pro Gly Phe Glu Ser Leu Leu Glu Gly Gln Thr Glu Gln 
20 25 30 

Ile Glu Val Leu Leu Pro Gly His Ser Leu Gly Leu Phe Pro Val Val 
35 40 45 

Val Lys Pro Asp Thr Val Gln Phe Met Ser Pro Leu Met Val Leu Glu 
50 55 60 

Ser Ser Gly Leu Ala Ala Leu Pro Ala Ala Glu Arg Gln Lys Ala Leu 
65 70 75 80 

Ala Ala Leu Ser Arg Pro Leu Leu Arg Asn Ser Asn Leu Val Cys Gly 
85 90 95 

Val Ser Glu Ala Lys Asp Ser Ser Glu Cys Gly Tyr Val Ala Thr Asp 
100 105 110 

Lys Glu Asp Val Ala Val Ile Phe Asp Glu Asn Asn Ala Gln Leu Ser 
115 120 125 

Leu Phe Leu Asn Arg Asp Trp Leu Pro Asp Glu Glu Arg Arg Asp Lys 
130 135 140 

Arg Trp Leu Thr Pro Thr Pro Glu Gly Val Ser Ala Phe Ile His Arg 
145 150 155 160 

Gln Thr Leu Tyr Leu Ser Asp Asp Leu His Ser Arg Asn Met Thr Leu 
165 170 175 

Asn Gly Ser Gly Ala Leu Gly Leu Gly Asp Gly Arg Tyr Leu Gly Gly 
180 185 190 

Asp Trp Ala Ala Ile Trp Asn Gln Ser Glu His Tyr Asn Asn Ser Gln 
195 200 205 

Ala Trp Phe Asp Asn Leu Phe Val Arg Gln Asp Leu Gly Asn Gln Tyr 
210 215 220 

Tyr Leu Gln Ala Gly Arg Met Asp Gln Arg Asn Leu Ser Ser Ala Thr 
225 230 235 240 

Gly Gly Asp Phe Gly Phe Ser Leu Leu Pro Leu Ser Arg Phe Asp Gly 
245 250 255 

Leu Arg Thr Gly Thr Thr Gln Ala Tyr Val Asn His Glu Val Asp His 
260 265 270 

Asn Ala Thr Pro Val Met Val Gln Val Thr Arg Asn Ala Arg Ile Asp 
275 280 285 

Ile Tyr Arg Gly Ser Glu Leu Leu Gly Ser Gln Phe Leu Thr Pro Gly 
290 295 300 

Met His Thr Leu Asp Thr His Ser Leu Pro Pro Gly Ser Tyr Pro Leu 
305 310 315 320 

Ala Leu Arg Val Tyr Glu Asp Gly Ile Leu Arg Arg Thr Glu Thr Gln 
325 330 335 

Pro Phe Ser Lys Gly Gly Asn Ser Phe Ser Ala Gln Thr Gln Trp Phe 
340 345 350 

Ile Gln Gly Gly Leu Glu Asp Thr Gly Asp Lys Ala Ser His Tyr Asp 
355 360 365 

Gly Glu Thr Val Met Ala Ala Gly Phe Gln Thr Gly Leu Arg Lys Asn 
370 375 380 

Ile Ser Leu Thr Glu Gly Ile Ser Leu Ala His Glu Ala Trp Tyr Ser 
385 390 395 400 

Glu Thr Arg Leu Asn Ser Gln His Ala Val Leu Asp Gly Thr Leu Asp 
405 410 415 

Leu Ser Ala Gly Ile Leu His Gly Thr Asp Ser Thr Ser Gly Asn Thr 
420 425 430 

Glu Gln Val Thr Tyr Asn Asp Gly Phe Ser Ala Ser Leu Trp Arg Asn 
435 440 445 

His Thr Glu Ser Asp Ala Cys Ser Gly Arg His Pro Gln Ser Val His 
450 455 460 

Ala Ser Met Thr Cys Gln Thr Ser Met Asn Ala Ser Leu Ser Val Ser 
465 470 475 480 

Val Gly Asn Trp Tyr Ala Leu Leu Gly Tyr Ser Thr Ser Arg Thr Glu 
485 490 495 

Gly Arg Pro Val Tyr Arg Gly Tyr Asp Asp Asn Ser Asp Lys Glu Asn 
500 505 510 

Val Phe Trp Arg Gln Ala Tyr Ile Pro Ala Ser His Arg Glu Ser Ala 
515 520 525 

Gln Ala Ser Ala Thr Tyr Ser Leu Asn Met Ala Gly Met Asn Ile Asn 
530 535 540 

Thr His Gly Gly Val Trp Arg Thr Arg Asn Asp Gly Val Asn Asp Asp 
545 550 555 560 

Gly Leu Phe Met Ser Val Ser Val Ser Tyr Ala Ser Gln Pro Pro Thr 
565 570 575 

Met Thr Gly Ser Asn Arg Tyr Thr Ser Ala Gly Thr Asp Ile His Ser 
580 585 590 

Ser Arg Asn Gln Lys Thr Gln Thr Ser Trp Asn Val Asn His Val Arg 
595 600 605 

Ser Trp Gln Gln Asp Leu Tyr Arg Glu Leu Ser Val Gly Phe Ser Gly 
610 615 620 

Tyr Asn Asp Asp Ser Trp Ser Gly Ser Leu Gly Gly Arg Met Ser Gly 
625 630 635 640 

Arg Met Gly Glu Leu Ser Ala Thr Ile Ser Asn Ser His Gln Arg Asn 
645 650 655 

Ala Gly Ser Ala Ser Ser Leu Thr Ala Gly Tyr Ser Ser Ser Leu Ala 
660 665 670 

Leu Ser Arg Asn Gly Leu Phe Trp Gly Gly Gly Gln Asp Gly Glu Pro 
675 680 685 

Ala Ser Gly Met Ala Val Asn Val Glu Ser Glu Gly Asp Glu Gly Ser 
690 695 700 

Ser Gly Lys Val Val Ser Val Arg Gly Ser Ser Gln Pro Phe Ser Leu 
705 710 715 720 

Gly Phe Gly Gln Gln Ser Leu Leu Leu Met Glu Gly Tyr Asn Ala Thr 
725 730 735 

Glu Val Thr Ile Glu Asp Ala Gly Val Ser Ser Gln Gly Met Ala Gly 
740 745 750 

Val Lys Ala Gly Gly Gly Ser Arg Cys Tyr Phe Leu Thr Pro Gly His 
755 760 765 

Leu Leu Val His Asn Ile Ser Ala Ser Met Ser Arg Leu Tyr Val Gly 
770 775 780 

Arg Val Leu Asp Lys Asp Gly Arg Pro Leu Leu Asp Ala Gln Pro Leu 
785 790 795 800 

Asn Tyr Pro Phe Leu Ser Leu Gly Pro Ser Gly Arg Phe Ser Leu Gln 
805 810 815 

Ser Glu His Lys Glu Ser Ser Leu Trp Leu Leu Ser Lys Asn Arg Ile 
820 825 830 

Leu Arg Cys Pro Met Ser Val His Lys Arg Arg Asp Val Met Gln Val 
835 840 845 

Val Gly Asp Val Arg Cys Glu Leu Ser Asp Val Asp Ala Leu Pro Gln 
850 855 860 

Ala Leu Gln Ile Ser Pro Arg Val Ile Arg Leu Leu Asn Val Ala Gly 
865 870 875 880 

Leu Leu Arg His Ser Val Gln Glu Ala 
885 



<210> 5 
<211> 359 
<212> PRT 
<213> Salmonella typhi 
<400> 5 

Met Ser Asn Lys Met Lys Trp Thr Ser Met Thr Ala His Trp Ser Ala 
1 5 10 15 

Ile Ile Asn Phe Ile Arg Lys Tyr Val Tyr Pro Ala Arg Ile Ile Ala 
20 25 30 

Ile Leu Leu Met Ala Gly Ala Thr Leu Pro Gln Val Ala Asp Ala Ile 
35 40 45 

Thr Val Asp Leu Asn Tyr Asp Lys Asn Asn Val Ala Val Ile Thr Pro 
50 55 60 

Val Trp Ser Gln Glu Trp Ser Val Ala Asn Val Leu Gly Gly Trp Val 
65 70 75 80 

Cys Arg Ser Asn Arg Asn Glu Asn Glu Gly Ala Cys Glu Glu Thr His 
85 90 95 

Leu Val Trp Trp Tyr Ala Phe Gly Ala Tyr Ser Lys Ile Arg Leu Arg 
100 105 110 

Phe Arg Glu Gln Ile Ser His Ala Glu Ile Thr Leu Ile Leu Leu Gly 
115 120 125 

Ser Val Arg Asp Ala Cys Tyr Thr Gly Val Ile Asn Met Asn Ala Ala 
130 135 140 

Ala Cys Gln Trp Gly Arg Ser Leu Lys Leu Arg Ile Pro Ser Glu Glu 
145 150 155 160 

Leu Ala Lys Ile Pro Thr Ser Gly Thr Trp Lys Ala Thr Leu Val Leu 
165 170 175 

Asp Tyr Leu Gln Trp Gly Gly Asp Asp Pro Leu Gly Thr Ser Thr Thr 
180 185 190 



Asp Ile Thr Leu Asn Val Thr Asp His Phe Ala Glu Asn Ala Ala Ile 
195 200 205 

Tyr Phe Pro Gln Phe Gly Thr Ala Thr Pro Arg Val Asp Leu Asn Leu 
210 215 220 

His Arg Met Asn Ala Ser Gln Met Ser Gly Arg Ala Asn Leu Asp Met 
225 230 235 240 

Cys Leu Tyr Asp Gly Gly Val Lys Ala Arg Ser Leu Gln Met Lys Ile 
245 250 255 

Glu Gly Ser Asn Lys Ser Gly Thr Gly Phe Gln Val Ile Lys Ser Asp 
260 265 270 

Ser Ala Asp Thr Ile Asp Tyr Ala Val Ser Met Asn Tyr Gly Gly Arg 
275 280 285 

Ser Ile Pro Val Thr Arg Gly Val Glu Phe Ser Leu Asp Asn Val Asp 
290 295 300 

Lys Ala Ala Thr Arg Pro Val Val Leu Pro Gly Gln Arg Gln Ala Val 
305 310 315 320 

Arg Cys Val Pro Val Pro Leu Thr Leu Thr Thr Gln Pro Phe Asn Ile 
325 330 335 

Arg Glu Lys Arg Ser Gly Glu Tyr Gln Gly Thr Leu Thr Val Thr Met 
340 345 350 

Leu Met Gly Thr Gln Thr Pro 
355 



<210> 6 
<211> 151 
<212> PRT 

<213> Salmonella typhi 
<400> 6 

Met Thr Val Tyr Leu Asp Asp Lys Asp Lys Glu Leu Leu Lys Glu Ile 
1 5 10 15 

Gln Lys Asp Cys Ala Gln Thr Leu Trp Gln Leu Ala Tyr Lys Val Gly 
20 25 30 

Leu Thr Pro Thr Pro Cys Phe Lys Arg Leu Lys Lys Leu Lys Asp Arg 
35 40 45 

Gly Val Ile Ile Gly Gln Phe Ala Leu Leu Asp Lys Glu Lys Leu Gly 
50 55 60 

Leu Ser Leu Asn Val Phe Ile Met Ile Asn Ile Ser Glu Glu Gln Tyr 
65 70 75 80 

Ala Ser Ile Ser Glu Lys Ile Lys Ser Met Pro Glu Val Ile Ala Phe 
85 90 95 

Tyr Arg Ile Ser Gly Ser Phe Asn Tyr Leu Met His Thr Val Phe Thr 
100 105 110 

Asp Met Asn Asp Tyr Tyr Ser Phe Tyr Glu Lys Ile Ile Leu Thr Asn 
115 120 125 

Ser Ser Ile Ser Gly Ser Ala Ser Ser Phe Val Leu Glu Gln Ile Lys 
130 135 140 

Glu Thr Asn Glu Leu Ser Val 
145 150 

Claims: 

1. Protein encoded by a nucleotide sequence selected from the group 
consisting of SEQ ID NO 1 and SEQ ID NO 2, or parts thereof, for use in 
medicine. 

2. Antibodies directed against the protein encoded by a nucleotide sequence 
selected from the group consisting of SEQ ID NO 1 and SEQ ID NO 2, or 
antigenic fragments thereof, for use in medicine. 

3. Nucleotide sequence selected from the group consisting of SEQ ID NO 1 
and SEQ ID NO 2, or parts thereof, for use in medicine. 

4. A vaccine for the protection against diseases caused by Salmonella enterica 
subspecies I, comprising the protein, or parts thereof, encoded by the 
nucleotide sequence according to SEQ ID NO 1 or antibodies directed against 
the protein encoded by SEQ ID NO 1, or antigenic fragments thereof and, 
optionally, a pharmaceutically acceptable carrier. 

5. A vaccine for the protection against diseases caused by Salmonella enterica 
subspecies I serovar Typhi, comprising the protein, or parts thereof, encoded 
by the nucleotide sequence according to SEQ ID NO 2 or antibodies directed 
against the protein encoded by SEQ ID NO 2, or antigenic fragments thereof 
and, optionally, a pharmaceutically acceptable carrier. 

6. A nucleic acid vaccine for the protection against diseases caused by 
Salmonella enterica subspecies I, comprising SEQ ID NO 1, or parts thereof 
and, optionally, a pharmaceutically acceptable carrier. 

7. A nucleic acid vaccine for the protection against diseases caused by 
Salmonella enterica subspecies I serovar Typhi, comprising SEQ ID NO 2, or 
parts thereof and, optionally, a pharmaceutically acceptable carrier. 

8. A vector vaccine for the protection against diseases caused by Salmonella 
enterica subspecies I, comprising a host in which a recombinant vector 
comprising SEQ ID NO 1, or parts thereof, has been inserted and, optionally, a 
pharmaceutically acceptable carrier. 

9. A vector vaccine for the protection against diseases caused by Salmonella 
enterica subspecies I serovar Typhi, comprising a host in which a recombinant 
vector comprising SEQ ID NO 2, or parts thereof, has been inserted and, 
optionally, a pharmaceutically acceptable carrier. 

10. A method for protection against diseases caused by Salmonella enterica 
subspecies I, comprising administering a vaccine according to any of claims 4, 
6, and 8. 

11. A method for protection against diseases caused by Salmonella enterica 
subspecies I serovar Typhi, comprising administering a vaccine according to 
any of claims 5, 7, and 9. 

12. Antibodies directed against the protein encoded by a nucleotide sequence 
selected from the group consisting of SEQ ID NO 1 and SEQ ID NO 2, or 
antigenic fragments thereof, for use in a diagnostic method. 

13. Protein encoded by a nucleotide sequence selected from the group 
consisting of SEQ ID NO 1 and SEQ ID NO 2, or parts thereof, for use in a 
diagnostic method. 

14. Primers for, or probes that hybridize with, a nucleotide sequence selected 
from the group consisting of SEQ ID NO 1 and SEQ ID NO 2, for use in a 
diagnostic method. 

ABSTRACT 

The present invention is based on the finding that two fimbrial structures are 
specific for Salmonella enterica subspecies I bacteria. Due to their specificity 
they can be used to provide vaccines against Salmonella enterica subspecies I 
as well as for detection of Salmonella enterica subspecies I. 